Infosys Science Foundation Series in Mathematical Sciences 




Ramji Lal


Algebra 2 


Linear Algebra, Galois Theory,
Representation Theory, Group 
Extensions and Schur Multiplier 




Springer


Infosys Science Foundation Series 


Infosys Science Foundation Series in Mathematical 
Sciences 


Series editors 


Gopal Prasad, University of Michigan, USA 
Irene Fonseca, Mellon College of Science, USA 


Editorial Board 


Chandrasekhar Khare, University of California, USA 

Mahan Mj, Tata Institute of Fundamental Research, Mumbai, India 
Manindra Agrawal, Indian Institute of Technology Kanpur, India 
S.R.S. Varadhan, Courant Institute of Mathematical Sciences, USA 
Weinan E, Princeton University, USA 


The Infosys Science Foundation Series in Mathematical Sciences is a sub-series of 
The Infosys Science Foundation Series. This sub-series focuses on high quality 
content in the domain of mathematical sciences and various disciplines of 
mathematics, statistics, bio-mathematics, financial mathematics, applied mathematics, 
operations research, applied statistics and computer science. All content published
in the sub-series is written, edited, or vetted by the laureates or jury members of the
Infosys Prize. With the Series, Springer and the Infosys Science Foundation hope to 
provide readers with monographs, handbooks, professional books and textbooks 
of the highest academic quality on current topics in relevant disciplines. Literature in 
this sub-series will appeal to a wide audience of researchers, students, educators, 
and professionals across mathematics, applied mathematics, statistics and computer 
science disciplines. 


More information about this series at http://www.springer.com/series/13817 


Ramji Lal 


Algebra 2 


Linear Algebra, Galois Theory, 
Representation Theory, Group Extensions 
and Schur Multiplier 


Springer


Ramji Lal 
Harish Chandra Research Institute (HRI) 
Allahabad, Uttar Pradesh 


India 

ISSN 2363-6149 ISSN 2363-6157 (electronic) 
Infosys Science Foundation Series 

ISSN 2364-4036 ISSN 2364-4044 (electronic) 
Infosys Science Foundation Series in Mathematical Sciences 

ISBN 978-981-10-4255-3 ISBN 978-981-10-4256-0 (eBook) 


DOI 10.1007/978-981-10-4256-0
Library of Congress Control Number: 2017935547 


© Springer Nature Singapore Pte Ltd. 2017 

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part 
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, 
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission 
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar 
methodology now known or hereafter developed. 

The use of general descriptive names, registered names, trademarks, service marks, etc. in this 
publication does not imply, even in the absence of a specific statement, that such names are exempt from 
the relevant protective laws and regulations and therefore free for general use. 

The publisher, the authors and the editors are safe to assume that the advice and information in this 
book are believed to be true and accurate at the date of publication. Neither the publisher nor the 
authors or the editors give a warranty, express or implied, with respect to the material contained herein or 
for any errors or omissions that may have been made. The publisher remains neutral with regard to 
jurisdictional claims in published maps and institutional affiliations. 


Printed on acid-free paper 
This Springer imprint is published by Springer Nature 


The registered company is Springer Nature Singapore Pte Ltd. 
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore 


Dedicated to the memory of 

my mother 

(Late) Smt Murti Devi, 

my father 

(Late) Sri Sankatha Prasad Lal, and 
my father-like brother

(Late) Sri Gopal Lal 


Preface 


Algebra has played a central and decisive role in all branches of mathematics and,
in turn, in all branches of science and engineering. It is not possible for a lecturer to
cover, physically in a classroom, the amount of algebra which a graduate student
(irrespective of the branch of science, engineering, or mathematics in which he
prefers to specialize) needs to master. In addition, there are a variety of students in a
class. Some of them grasp the material very fast and do not need much assistance.
At the same time, there are serious students who can do equally well by
putting in a little more effort. They need some more illustrations and also more
exercises to develop their skill and confidence in the subject by solving problems on
their own. Again, it is not possible for a lecturer to do sufficiently many illustrations
and exercises in the classroom for the aforesaid purpose. This is one of the
considerations which prompted me to write a series of three volumes on the subject
starting from the undergraduate level to the advanced postgraduate level. Each
volume is sufficiently rich with illustrations and examples together with numerous
exercises. These volumes also cater for the need of the talented students with
difficult, challenging, and motivating exercises which were responsible for the
further developments in mathematics. Occasionally, exercises demonstrating the
applications in different disciplines are also included. The books may also act as a
guide to teachers giving the courses. Researchers working in the field may also
find them useful.

The first volume consists of 11 chapters. It starts with the language of mathematics
(logic and set theory) and centers around the introduction to basic algebraic
structures, viz., groups, rings, polynomial rings, and fields together with
fundamentals in arithmetic. This volume serves as a basic text for the first-year course in
algebra at the undergraduate level. Since this is the first introduction to the
abstract-algebraic structures, we proceed rather leisurely in this volume as
compared with the other volumes.

The present (second) volume contains 10 chapters which include the fundamentals
of linear algebra, structure theory of fields and the Galois theory, representation
theory of groups, and the theory of group extensions. It is needless to say
that linear algebra is the most applicable branch of mathematics, and it is essential
for students of any discipline to develop expertise in the same. As such, linear
algebra is an integral part of the syllabus at the undergraduate level. Indeed, a very
significant and essential part (Chaps. 1-5) of linear algebra covered in this volume
does not require any background material from Volume 1 of the book except some
amount of set theory. General linear algebra over rings, Galois theory, representation
theory of groups, and the theory of group extensions follow linear algebra,
and indeed these are parts of the syllabus for the second- and the third-year students
of most of the universities. As such, this volume together with the first volume may
serve as a basic text for the first-, second-, and third-year courses in algebra.

The third volume of the book contains 10 chapters, and it can act as a text for
graduate and advanced graduate students specializing in mathematics. This includes
commutative algebra, basics in algebraic geometry, semi-simple Lie algebras,
advanced representation theory, and Chevalley groups. The table of contents gives an
idea of the subject matter covered in the book.

There is no essential prerequisite for the book, except that, occasionally, some
illustrations and exercises may need some amount of calculus, geometry, or topology.
An attempt to follow the logical ordering has been made throughout
the book.

My teacher (Late) Prof. B.L. Sharma, my colleague at the University of
Allahabad, my friend Dr. H.S. Tripathi, my students Prof. R.P. Shukla, Prof.
Shivdatt, Dr. Brajesh Kumar Sharma, Mr. Swapnil Srivastava, Dr. Akhilesh Yadav,
Dr. Vivek Jain, Dr. Vipul Kakkar, and above all, the mathematics students of the
University of Allahabad have always been the motivating force for me to write a
book. Without their continuous insistence, it would not have come in the present
form. I wish to express my warmest thanks to all of them.

Harish-Chandra Research Institute (HRI), Allahabad, has always been a great
source for me to learn more and more mathematics. I wish to express my deep sense
of appreciation and thanks to HRI for providing me all infrastructural facilities to
write these volumes.

Last but not least, I wish to express my thanks to my wife Veena Srivastava who
has always been helpful in this endeavor.

In spite of all care, some mistakes and misprints might have crept in and escaped
my attention. I shall be grateful to anyone who brings them to my attention. Criticisms
and suggestions for the improvement of the book will be appreciated and gratefully
acknowledged.


Allahabad, India Ramji Lal 
April 2017 


Contents 


1 Vector Spaces
1.1 Concept of a Field
1.2 Concept of a Vector Space (Linear Space)
1.3 Subspaces
1.4 Basis and Dimension
1.5 Direct Sum of Vector Spaces, Quotient of a Vector Space

2 Matrices and Linear Equations
2.1 Matrices and Their Algebra
2.2 Types of Matrices
2.3 System of Linear Equations
2.4 Gauss Elimination, Elementary Operations, Rank, and Nullity
2.5 LU Factorization
2.6 Equivalence of Matrices, Normal Form
2.7 Congruent Reduction of Symmetric Matrices

3 Linear Transformations
3.1 Definition and Examples
3.2 Isomorphism Theorems
3.3 Space of Linear Transformations, Dual Spaces
3.4 Rank and Nullity
3.5 Matrix Representations of Linear Transformations
3.6 Effect of Change of Bases on Matrix Representation

4 Inner Product Spaces
4.1 Definition, Examples, and Basic Properties
4.2 Gram-Schmidt Process
4.3 Orthogonal Projection, Shortest Distance
4.4 Isometries and Rigid Motions

5 Determinants and Forms
5.1 Determinant of a Matrix
5.2 Permutations
5.3 Alternating Forms, Determinant of an Endomorphism
5.4 Invariant Subspaces, Eigenvalues
5.5 Spectral Theorem, and Orthogonal Reduction
5.6 Bilinear and Quadratic Forms

6 Canonical Forms, Jordan and Rational Forms
6.1 Concept of a Module over a Ring
6.2 Modules over PID
6.3 Rational and Jordan Forms

7 General Linear Algebra
7.1 Noetherian Rings and Modules
7.2 Free, Projective, and Injective Modules
7.3 Tensor Product and Exterior Power
7.4 Lower K-theory

8 Field Theory, Galois Theory
8.1 Field Extensions
8.2 Galois Extensions
8.3 Splitting Field, Normal Extensions
8.4 Separable Extensions
8.5 Fundamental Theorem of Galois Theory
8.6 Cyclotomic Extensions
8.7 Geometric Constructions
8.8 Galois Theory of Equation

9 Representation Theory of Finite Groups
9.1 Semi-simple Rings and Modules
9.2 Representations and Group Algebras
9.3 Characters, Orthogonality Relations
9.4 Induced Representations

10 Group Extensions and Schur Multiplier
10.1 Schreier Group Extensions
10.2 Obstructions and Extensions
10.3 Central Extensions, Schur Multiplier
10.4 Lower K-Theory Revisited

Bibliography


About the Author 


Ramji Lal is Adjunct Professor at the Harish-Chandra Research Institute (HRI),
Allahabad, Uttar Pradesh. He started his research career at the Tata Institute of 
Fundamental Research (TIFR), Mumbai, and served at the University of Allahabad 
in different capacities for over 43 years: as a Professor, Head of the Department, and 
the Coordinator of the DSA Program. He was associated with HRI, where he 
initiated a postgraduate (PG) program in mathematics and coordinated the Nurture 
Program of National Board for Higher Mathematics (NBHM) from 1996 to 2000. 
After his retirement from the University of Allahabad, he was Advisor cum Adjunct 
Professor at the Indian Institute of Information Technology (IIIT), Allahabad, for 
over 3 years. His areas of interest include group theory, algebraic K-theory, and 
representation theory. 




Notations from Algebra 1 


$\langle a \rangle$ Cyclic subgroup generated by a, p. 122
$a \mid b$ a divides b, p. 57
$a \sim b$ a is an associate of b, p. 57
$A^t$ The transpose of a matrix A, p. 200
$A^\star$ The hermitian conjugate of a matrix A, p. 215
Aut(G) The automorphism group of G, p. 105
The alternating group of degree n, p. 175
Borel subgroup, p. 187
The centralizer of H in G, p. 159
The field of complex numbers, p. 78
The dihedral group of order 2n, p. 90
det Determinant map, p. 191
Semigroup of endomorphisms of G, p. 105
Image of A under the map f, p. 34
Inverse image of B under the map f, p. 34
Restriction of the map f to Y, p. 30
Transvections, p. 200
Fitting subgroup, p. 353
Greatest common divisor, p. 58
Greatest lower bound, or inf, p. 40
The set of left (right) cosets of G mod H, p. 135
The quotient group of G modulo H, p. 151
The index of H in G, p. 135
Order of G, p. 331
Commutator subgroup of G, p. 403
nth term of the derived series of G, p. 345
General linear group, p. 186
Identity map on X, p. 30
Inclusion map from Y, p. 30
The group of inner automorphisms, p. 407




$\ker f$ The kernel of the map f, p. 35
$L_n(G)$ nth term of the lower central series of G, p. 281
l.c.m. Least common multiple, p. 58
l.u.b. Least upper bound, or sup, p. 40
$M_n(R)$ The ring of $n \times n$ matrices with entries in R, p. 350
$\mathbb{N}$ Natural number system, p. 21
$N_G(H)$ Normalizer of H in G, p. 159
O(n) Orthogonal group, p. 197
O(1, n) Lorentz orthogonal group, p. 201
PSO(1, n) Positive special Lorentz orthogonal group, p. 201
$\mathbb{Q}$ The field of rational numbers, p. 74
$Q_8$ The quaternion group, p. 88
$\mathbb{R}$ The field of real numbers, p. 75
R(G) Radical of G, p. 346
$S_n$ Symmetric group of degree n, p. 88
Sym(X) Symmetric group on X, p. 88
$S^3$ The group of unit quaternions, p. 92
$\langle S \rangle$ Subgroup generated by a subset S, p. 116
SL(n, R) Special linear group, p. 196
SO(n) Special orthogonal group, p. 197
SO(1, n) Special Lorentz orthogonal group, p. 201
SP(2n, R) Symplectic group, p. 202
SU(n) Special unitary group, p. 202
U(n) Unitary group, p. 202
$U_m$ Group of prime residue classes modulo m, p. 100
$V_4$ Klein's four group, p. 102
X/R The quotient set of X modulo R, p. 36
$R_x$ Equivalence class modulo R determined by x, p. 27
$x^+$ Successor of x, p. 20
$X^Y$ The set of maps from Y to X, p. 34
$\subset$ Proper subset, p. 14
$\wp(X)$ Power set of X, p. 19
$\prod_{k=1}^{n} G_k$ Direct product of groups $G_k$, $1 \le k \le n$, p. 142
$\trianglelefteq$ Normal subgroup, p. 147
$\trianglelefteq\trianglelefteq$ Subnormal subgroup, p. 332
Z(G) Center of G, p. 108
$\mathbb{Z}_m$ The ring of residue classes modulo m, p. 256
p(n) The number of partitions of n, p. 172
$H \rtimes K$ Semidirect product of H with K, p. 204
$\sqrt{A}$ Radical of an ideal A, p. 286
R(G) Semigroup ring of a ring R over a semigroup G, p. 238
R[X] Polynomial ring over the ring R in one variable, p. 240


$R[X_1, X_2, \ldots, X_n]$ Polynomial ring in several variables, p. 247
$\mu$ The Möbius function, p. 256
$\sigma$ Sum of divisor function, p. 256
$\left(\frac{a}{p}\right)$ Legendre symbol, p. 280
Stab(G, X) Stabilizer of an action of G on X, p. 295
$G_x$ Isotropy subgroup of an action of G at x, p. 295
$X^G$ Fixed point set of an action of G on X, p. 296
$Z_n(G)$ nth term of the upper central series of G, p. 351
$\Phi(G)$ The Frattini subgroup of G, p. 355


Notations from Algebra 2 


$B^2(K, H)$ Group of 2 co-boundaries with given $\sigma$, p. 385
C(A) Column space of A, p. 42
Ch(G, K) Set of characters from G to K, p. 278
Ch(G) Character ring of G, p. 350
dim(V) Dimension of V, p. 18
EXT Category of Schreier group extensions, p. 368
E(H, K) The set of equivalence classes of extensions of H by K, p. 376
$E_1 \boxplus E_2$ Baer sum of extensions, p. 388
$EXT_\psi(H, K)$ Set of equivalence classes of extensions associated to abstract kernel $\psi$, p. 384
E(V) Exterior algebra of V, p. 257
FACS Category of factor systems, p. 375
F(X) The fixed field of a set of automorphisms of a field, p. 275
G(L/K) The Galois group of the field extension L of K, p. 275
$G \wedge G$ Non-abelian exterior square of a group G, p. 413
Algebraic closure of K, p. 289
Second cohomology with given $\sigma$, p. 385
Grothendieck group of the ring R, p. 257
Whitehead group of the ring R, p. 260
Separable closure of K in L, p. 295
Field extension L of K, p. 262
Minimum polynomial of linear transformation T, p. 212
Minimum polynomial of a over the field K, p. 265
Group of rigid motions on V, p. 122
Tensor product of R-modules M and N, p. 250
Norm map from L to K, p. 279
Null space of A, p. 41
Obstruction of the abstract kernel $\psi$, p. 393
Row space of A, p. 42
Steinberg group, p. 422




rth symmetric power of V, p. 345
Trace map from L to K, p. 314
Tensor algebra of V, p. 257
Semi-simple part of T, p. 219
Nilpotent part of T, p. 220
Group of 2 co-cycles with given $\sigma$, p. 385
rth exterior power of V, p. 255
Abstract kernel associated to the extension E, p. 377
Direct sum of representations $\rho$ and $\eta$, p. 345
Tensor product of representations $\rho$ and $\eta$, p. 345
rth symmetric power of the representation $\rho$, p. 345
Set of all intermediary fields of L/K, p. 275
rth exterior power of the representation $\rho$, p. 345
Character afforded by the representation $\rho$, p. 350
nth cyclotomic polynomial, p. 311
Characteristic polynomial of A, p. 149


Chapter 1 
Vector Spaces 


This chapter is devoted to the structure theory of vector spaces over arbitrary fields.
In essence, a vector space is a structure in which we can perform all basic operations
of vector algebra, and can talk of lines, planes, and linear equations. The basic motivating
examples on which we shall dwell are the Euclidean 3-space $\mathbb{R}^3$ over $\mathbb{R}$ in which
we live, the Minkowski space $\mathbb{R}^4$ of events (in which the first three coordinates
represent the place and the fourth coordinate represents the time of the occurrence
of the event), and also the space of matrices.


1.1 Concept of a Field 


Rings and fields have been introduced and studied in Algebra 1. However, to make the
linear algebra part (Chaps. 1-5) of this volume independent of Algebra 1, we recall,
quickly, the concept of a field and its basic properties. A field is an algebraic structure
in which we can perform all arithmetical operations, viz., addition, subtraction,
multiplication, and division by nonzero members. The basic motivating examples are the
structure $\mathbb{Q}$ of rational numbers, the structure $\mathbb{R}$ of real numbers, and the structure
$\mathbb{C}$ of complex numbers with the usual operations. The precise definition of a field is as
follows:


Definition 1.1.1 A field is a triple $(F, +, \cdot)$, where $F$ is a set, and $+$ and $\cdot$ are two
internal binary operations, called the addition and the multiplication on $F$, such that
the following hold:

1. $(F, +)$ is an abelian group in the following sense:

(i) The operation $+$ is associative in the sense that

$(a + b) + c = a + (b + c)$ for all $a, b, c \in F$.

(ii) The operation $+$ is commutative in the sense that

$a + b = b + a$ for all $a, b \in F$.

© Springer Nature Singapore Pte Ltd. 2017
R. Lal, Algebra 2, Infosys Science Foundation Series in Mathematical Sciences,
DOI 10.1007/978-981-10-4256-0_1


(iii) There is a unique element $0 \in F$, called the zero of $F$, such that

$a + 0 = a = 0 + a$ for all $a \in F$.

(iv) For all $a \in F$, there is a unique element $-a \in F$, called the negative of $a$,
such that

$a + (-a) = 0 = -a + a$.

2. (i) The operation $\cdot$ is associative in the sense that

$(a \cdot b) \cdot c = a \cdot (b \cdot c)$ for all $a, b, c \in F$.

(ii) The operation $\cdot$ is commutative in the sense that

$a \cdot b = b \cdot a$ for all $a, b \in F$.

3. The operation $\cdot$ distributes over $+$ in the sense that

(i) $a \cdot (b + c) = a \cdot b + a \cdot c$, and

(ii) $(a + b) \cdot c = a \cdot c + b \cdot c$ for all $a, b, c \in F$.

4. (i) There is a unique element $1 \in F - \{0\}$, called the one of $F$, such that

$1 \cdot a = a = a \cdot 1$ for all $a \in F$.

(ii) For all $a \in F - \{0\}$, there is a unique element $a^{-1} \in F$, called the multiplicative
inverse of $a$, such that

$a \cdot a^{-1} = 1 = a^{-1} \cdot a$.


Before having some examples, let us observe some simple facts:

Proposition 1.1.2 Let $(F, +, \cdot)$ be a field.

(i) The cancellation law holds for the addition $+$ in $F$ in the sense that $(a + b = a + c)$
implies $b = c$. In turn, $(b + a = c + a)$ implies $b = c$.
(ii) $a \cdot 0 = 0 = 0 \cdot a$ for all $a \in F$.
(iii) $a \cdot (-b) = -(a \cdot b) = (-a) \cdot b$ for all $a, b \in F$.
(iv) The restricted cancellation for the multiplication in $F$ holds in the sense that
($a \neq 0$ and $a \cdot b = a \cdot c$) implies $b = c$. In turn, ($a \neq 0$ and $b \cdot a = c \cdot a$)
implies $b = c$.
(v) $(a \cdot b = 0)$ implies that ($a = 0$ or $b = 0$).

Proof (i) Suppose that $a + b = a + c$. Then $b = 0 + b = (-a + a) + b = -a + (a + b)
= -a + (a + c) = (-a + a) + c = 0 + c = c$.

(ii) $0 + a \cdot 0 = a \cdot 0 = a \cdot (0 + 0) = a \cdot 0 + a \cdot 0$. Using the cancellation for $+$,
we get that $0 = a \cdot 0$. Similarly, $0 = 0 \cdot a$.

(iii) $0 = a \cdot 0 = a \cdot (b + (-b)) = a \cdot b + a \cdot (-b)$. It follows that
$a \cdot (-b) = -(a \cdot b)$. Similarly, the other part follows.

(iv) Suppose that $a \neq 0$ and $a \cdot b = a \cdot c$. Then $b = 1 \cdot b = (a^{-1} \cdot a) \cdot b =
a^{-1} \cdot (a \cdot b) = a^{-1} \cdot (a \cdot c) = (a^{-1} \cdot a) \cdot c = 1 \cdot c = c$. Similarly, the other part
follows.

(v) Suppose that $a \cdot b = 0$. If $a = 0$, there is nothing to do. Suppose that $a \neq 0$.
Then $a \cdot b = 0 = a \cdot 0$. From (iv), it follows that $b = 0$. $\sharp$


Integral Multiples and the Integral Powers of Elements of a Field

Let $a \in F$. For each natural number $n$, we define the multiple $na$ inductively as
follows: Define $1a = a$. Assuming that $na$ is defined, define $(n + 1)a = na + a$.
Thus, for a natural number $n$, $na = \underbrace{a + a + \cdots + a}_{n \text{ times}}$. We define $0a = 0$. Further,
if $m = -n$ is a negative integer, then we define $ma = n(-a)$. Thus, for a negative
integer $m = -n$, $ma = \underbrace{-a + (-a) + \cdots + (-a)}_{n \text{ times}}$. This defines the integral
multiple $na$ for each integer $n$. Similarly, we define all integral powers of a nonzero element
$a$ of $F$ as follows: Define $a^1 = a$. Assuming that $a^n$ has already been defined, define
$a^{n+1} = a^n \cdot a$. This defines all positive integral powers of $a$. Define $a^0 = 1$, and
for a negative integer $n = -m$, define $a^n = (a^{-1})^m$. The following laws of exponents
follow immediately by induction.

(i) $(n + m)a = na + ma$ for all $n, m \in \mathbb{Z}$.
(ii) $(nm)a = n(ma)$ for all $n, m \in \mathbb{Z}$.
(iii) $a^{n+m} = a^n \cdot a^m$ for all $a \in F - \{0\}$, and $n, m \in \mathbb{Z}$.
(iv) $a^{nm} = (a^n)^m$ for all $a \in F - \{0\}$, and $n, m \in \mathbb{Z}$.
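The inductive definitions above can be tried out on the field $\mathbb{Q}$. The following is only a minimal sketch: the function names `multiple` and `power` are our own (hypothetical) choices, and Python's `fractions.Fraction` is assumed as a stand-in for rational numbers.

```python
from fractions import Fraction


def multiple(n, a):
    """Integral multiple na, following the inductive definition."""
    if n == 0:
        return a - a                 # 0a = 0, the zero of the field
    if n < 0:
        return multiple(-n, -a)      # ma = n(-a) for m = -n
    result = a                       # 1a = a
    for _ in range(n - 1):
        result = result + a          # (k + 1)a = ka + a
    return result


def power(a, n):
    """Integral power a^n of a nonzero element a."""
    if a == 0:
        raise ValueError("powers are defined only for nonzero elements")
    if n == 0:
        return a / a                 # a^0 = 1, the one of the field
    if n < 0:
        return power(1 / a, -n)      # a^n = (a^{-1})^m for n = -m
    result = a                       # a^1 = a
    for _ in range(n - 1):
        result = result * a          # a^{k+1} = a^k . a
    return result


a = Fraction(3, 4)
for n in range(-4, 5):
    for m in range(-4, 5):
        assert multiple(n + m, a) == multiple(n, a) + multiple(m, a)
        assert multiple(n * m, a) == multiple(n, multiple(m, a))
        assert power(a, n + m) == power(a, n) * power(a, m)
        assert power(a, n * m) == power(power(a, n), m)
```

The loop checks the four laws only for one element and a small range of exponents; it illustrates the laws and does not replace the induction.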


Examples of Fields 


Example 1.1.3 The rational number system $\mathbb{Q}$, the real number system $\mathbb{R}$, and the
complex number system $\mathbb{C}$ with the usual addition and multiplication are basic examples
of fields.


Example 1.1.4 Consider $F = \mathbb{Q}(\sqrt{2}) = \{a + b\sqrt{2} \mid a, b \in \mathbb{Q}\}$. The addition and
multiplication in $\mathbb{R}$ induce the corresponding operations in $\mathbb{Q}(\sqrt{2})$. We claim that
$\mathbb{Q}(\sqrt{2})$ is a field with respect to the induced operations. All the defining properties of
a field are consequences of the corresponding properties in $\mathbb{R}$ except, perhaps, 4(ii)
which we verify. Let $a, b \in \mathbb{Q}$ such that $a + b\sqrt{2} \neq 0$. We claim that $a^2 - 2b^2 \neq 0$.
Suppose not. Then $a^2 - 2b^2 = 0$. In turn, $b = 0$ (and so also $a = 0$), otherwise,
$(\frac{a}{b})^2 = 2$, a contradiction to the fact that $\sqrt{2}$ is not a rational number. But then
$a + b\sqrt{2} = 0$, contrary to the assumption. Thus, $a^2 - 2b^2 \neq 0$, and

$\frac{1}{a + b\sqrt{2}} = \frac{a - b\sqrt{2}}{a^2 - 2b^2} = \frac{a}{a^2 - 2b^2} + \frac{-b}{a^2 - 2b^2}\sqrt{2}$ is in $\mathbb{Q}(\sqrt{2})$.
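The inverse formula of Example 1.1.4 can be checked numerically. In the sketch below, an element $a + b\sqrt{2}$ is represented by the pair `(a, b)` of rationals; this representation and the names `mul` and `inverse` are assumptions made for illustration only.

```python
from fractions import Fraction


def mul(x, y):
    """Product (a + b*sqrt2)(c + d*sqrt2) = (ac + 2bd) + (ad + bc)*sqrt2."""
    a, b = x
    c, d = y
    return (a * c + 2 * b * d, a * d + b * c)


def inverse(x):
    """Inverse of a + b*sqrt2 via a/(a^2 - 2b^2) + (-b/(a^2 - 2b^2))*sqrt2."""
    a, b = x
    norm = a * a - 2 * b * b         # nonzero unless a = b = 0
    if norm == 0:
        raise ZeroDivisionError("the zero element has no inverse")
    return (a / norm, -b / norm)


x = (Fraction(3), Fraction(5))       # the element 3 + 5*sqrt2
assert mul(x, inverse(x)) == (1, 0)  # x * x^{-1} = 1 + 0*sqrt2
```

Exact rational arithmetic is used on purpose: with floating-point numbers the identity would hold only approximately.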


Remark 1.1.5 There is nothing special about 2 in the above example, indeed, we can 
take any prime, or for that matter any rational number in place of 2 which is not a 
square of a rational number. 


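So far all the examples of fields are infinite. Now, we give an example of a finite
field.

Let $p$ be a positive prime integer. Consider the set $\mathbb{Z}_p = \{\bar{0}, \bar{1}, \ldots, \overline{p - 1}\}$ of
residue classes modulo the prime $p$. Clearly, $\bar{a} = \bar{r}$, where $r$ is the remainder obtained
when $a$ is divided by $p$. The usual addition $\oplus$ modulo $p$, and the multiplication $\star$
modulo $p$ are given by

$\bar{i} \oplus \bar{j} = \overline{i + j}$, $\bar{i}, \bar{j} \in \mathbb{Z}_p$,

and

$\bar{i} \star \bar{j} = \overline{i \cdot j}$, $\bar{i}, \bar{j} \in \mathbb{Z}_p$.

For example, in $\mathbb{Z}_{11}$, $\bar{6} \oplus \bar{7} = \overline{13} = \bar{2}$. Similarly, the product $\bar{6} \star \bar{7} = \overline{42} = \bar{9}$.
We have the following proposition.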


Proposition 1.1.6 For any prime $p$, the triple $(\mathbb{Z}_p, \oplus, \star)$ introduced above is a field
containing $p$ elements.


Proof Clearly, $\bar{1}$ is the identity with respect to $\star$. We verify only the postulate 4(ii)
in the definition of a field. The rest of the postulates are almost evident, and can be
verified easily. In fact, we give an algorithm (using the Euclidean algorithm) to find the
multiplicative inverse of a nonzero element $\bar{i} \in \mathbb{Z}_p$. Let $\bar{i} \in \mathbb{Z}_p - \{\bar{0}\}$. Then $p$ does
not divide $i$. Since $p$ is prime, the greatest common divisor of $i$ and $p$ is 1. Using the
Euclidean algorithm, we can find integers $b$ and $c$ such that

$1 = i \cdot b + p \cdot c$.

Thus, $\bar{1} = \overline{i \cdot b} = \bar{i} \star \bar{b}$. It follows that $\bar{b}$ is the inverse of $\bar{i}$ with respect to $\star$. $\sharp$


The above proof is algorithmic and gives an algorithm to find the multiplicative
inverse of nonzero elements in $\mathbb{Z}_p$.
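The algorithm contained in the proof can be sketched as follows. This is one common iterative form of the extended Euclidean algorithm, not necessarily the form the reader has met in Algebra 1; the names `extended_gcd` and `inverse_mod` are hypothetical, and a residue class $\bar{i}$ is represented by an integer.

```python
def extended_gcd(i, p):
    """Return (g, b, c) with g = gcd(i, p) and g = i*b + p*c."""
    old_r, r = i, p
    old_b, b = 1, 0
    old_c, c = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r  # remainder step of the Euclidean algorithm
        old_b, b = b, old_b - q * b  # keep track of the coefficient of i
        old_c, c = c, old_c - q * c  # keep track of the coefficient of p
    return old_r, old_b, old_c


def inverse_mod(i, p):
    """Multiplicative inverse of the residue class of i in Z_p (p prime)."""
    g, b, _ = extended_gcd(i % p, p)
    if g != 1:
        raise ValueError("the class of i is zero in Z_p, so it has no inverse")
    return b % p                     # reduce b to a representative in 0..p-1


# The arithmetic in Z_11 mentioned in the text.
assert (6 + 7) % 11 == 2
assert (6 * 7) % 11 == 9
assert inverse_mod(6, 11) == 2       # 6 * 2 = 12 = 1 in Z_11
```

Here $1 = 6 \cdot 2 + 11 \cdot (-1)$, so $b = 2$ and $c = -1$ in the notation of the proof.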


Definition 1.1.7 Let $(F, +, \cdot)$ be a field. A subset $L$ of $F$ is called a subfield of $F$
if the following hold:

(i) $0 \in L$.
(ii) If $a, b \in L$, then $a + b \in L$ and $a \cdot b \in L$.
(iii) $1 \in L$.
(iv) For all $a \in L$, $-a \in L$.
(v) For all $a \in L - \{0\}$, $a^{-1} \in L$.


Thus, a subfield $L$ of a field $F$ is also a field in its own right with respect to the
induced operations. The field $F$ is a subfield of itself. This subfield is called the
improper subfield of $F$. Other subfields are called proper subfields. The set $\mathbb{Q}$ of
rational numbers and the set $\mathbb{Q}(\sqrt{2})$ described in Example 1.1.4 are proper subfields of
the field $\mathbb{R}$ of real numbers. The field $\mathbb{R}$ of real numbers is a subfield of the field $\mathbb{C}$
of complex numbers.


Proposition 1.1.8 The field $\mathbb{Q}$ of rational numbers, and the field $\mathbb{Z}_p$ have no proper
subfields.


Proof We first show that $\mathbb{Q}$ has no proper subfields. Let $L$ be a subfield of $\mathbb{Q}$. Then
by Definition 1.1.7(iii), $1 \in L$. Again, by (ii), $n = \underbrace{1 + 1 + \cdots + 1}_{n}$ belongs to
$L$ for all natural numbers $n$. Thus, by (i) and (iv), all integers are in $L$. By (v), $\frac{1}{n} \in L$ for
all nonzero integers $n$. By (ii), $\frac{m}{n} \in L$ for all integers $m, n$; $n \neq 0$. This shows that
$L = \mathbb{Q}$.

Next, let $L$ be a subfield of $\mathbb{Z}_p$. Then by Definition 1.1.7(iii), $\bar{1} \in L$. By (ii),
$\bar{i} = \underbrace{\bar{1} \oplus \bar{1} \oplus \cdots \oplus \bar{1}}_{i}$ belongs to $L$ for all $\bar{i} \in \mathbb{Z}_p$. This shows that $L = \mathbb{Z}_p$. $\sharp$


We shall see that, essentially, these are the only fields which have no proper
subfields. Such fields are called prime fields.

Homomorphisms and Isomorphisms Between Fields


Definition 1.1.9 Let $F_1$ and $F_2$ be fields. A map $f$ from $F_1$ to $F_2$ is called a
field homomorphism if the following conditions hold:

(i) $f(a + b) = f(a) + f(b)$ for all $a, b \in F_1$ (note that $+$ in the LHS is the
addition of $F_1$, and that in the RHS is the addition of $F_2$).

(ii) $f(a \cdot b) = f(a) \cdot f(b)$ for all $a, b \in F_1$ (again $\cdot$ in the LHS is the multiplication
of $F_1$, and that in the RHS is the multiplication of $F_2$).

(iii) $f(1) = 1$, where 1 in the LHS denotes the multiplicative identity of $F_1$, and 1
in the RHS denotes the multiplicative identity of $F_2$.

A bijective homomorphism is called an isomorphism. A field $F_1$ is said to be
isomorphic to a field $F_2$ if there is an isomorphism from $F_1$ to $F_2$.


We do not distinguish isomorphic fields. 


Proposition 1.1.10 Let $f$ be a homomorphism from a field $F_1$ to a field $F_2$. Then,
the following hold.

(i) $f(0) = 0$, where 0 in the LHS is the zero of $F_1$, and 0 in the RHS is the zero of
$F_2$.
(ii) $f(-a) = -f(a)$ for all $a \in F_1$.
(iii) $f(na) = nf(a)$ for all $a \in F_1$, and for all integers $n$.
(iv) $f(a^n) = (f(a))^n$ for all $a \in F_1 - \{0\}$, and for all integers $n$.
(v) $f$ is injective, and the image of $F_1$ under $f$ is a subfield of $F_2$ which is isomorphic
to $F_1$.


Proof (i) $0 + f(0) = f(0) = f(0 + 0) = f(0) + f(0)$. Using the cancellation law
for addition in $F_2$, we get that $f(0) = 0$.

(ii) $0 = f(0) = f(a + (-a)) = f(a) + f(-a)$. This shows that $f(-a) = -f(a)$.

(iii) Suppose that $n = 0$. Then $0f(a) = 0 = f(0) = f(0a)$. Clearly, $f(1a) =
f(a) = 1f(a)$. Assume that $f(na) = nf(a)$ for a natural number $n$. Then
$f((n + 1)a) = f(na + a) = f(na) + f(a) = nf(a) + f(a) = (n + 1)f(a)$. By induction,
it follows that $f(na) = nf(a)$ for all $a \in F_1$, and for all natural numbers $n$. Suppose
that $n = -m$ is a negative integer. Then, $f(na) = f((-m)a) = f(-(ma)) =
-f(ma) = -(mf(a)) = (-m)f(a) = nf(a)$.

(iv) Replacing $na$ by $a^n$, imitate the proof of (iii).

(v) Suppose that $a \neq b$. Then $(a - b) \neq 0$. Now, $1 = f(1) = f((a - b)(a - b)^{-1})
= f(a - b)f((a - b)^{-1})$. Since $1 \neq 0$, it follows that $f(a) - f(b) =
f(a - b) \neq 0$. This shows that $f(a) \neq f(b)$. Thus, $f$ is injective, and it can be
realized as a bijective map from $F_1$ to $f(F_1)$. It is sufficient, therefore, to show that
$f(F_1)$ is a subfield of $F_2$. Clearly, $0 = f(0)$, and $1 = f(1)$ belong to $f(F_1)$. Let
$f(a), f(b) \in f(F_1)$, where $a, b \in F_1$. Then $f(a) + f(b) = f(a + b) \in f(F_1)$,
and also $f(a)f(b) = f(ab) \in f(F_1)$. By (ii), $-f(a) = f(-a) \in f(F_1)$. Finally, if
$f(a) \neq 0$, then $a \in F_1 - \{0\}$. But, then $(f(a))^{-1} = f(a^{-1}) \in f(F_1)$. $\sharp$


Characteristic of a Field

Let $F$ be a field. Consider the multiplicative identity 1 of $F$. There are two cases:

(i) Distinct integral multiples of 1 are distinct, or equivalently, $n1 = m1$ implies that
$n = m$. This is equivalent to saying that $n1 = 0$ if and only if $n = 0$. In this case we
say that $F$ is of characteristic 0. Thus, for example, the field $\mathbb{R}$ of real numbers,
the field $\mathbb{Q}$ of rational numbers, and the field $\mathbb{C}$ of complex numbers are fields
of characteristic 0.

(ii) Not all integral multiples of 1 are distinct. In this case there exists a pair $n, m$ of
distinct integers such that $n1 = m1$. But, then, $(n - m)1 = 0 = (m - n)1$.
In turn, there is a natural number $l$ such that $l1 = 0$. In this case, the smallest
natural number $l$ such that $l1 = 0$ is called the characteristic of $F$. Thus, the
characteristic of $\mathbb{Z}_p$ is $p$.
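Case (ii) suggests a direct search for the smallest natural number $l$ with $l1 = 0$. The sketch below is an illustration under our own assumptions: the function name `characteristic` and its parameters (`one`, `zero`, `add`, `bound`) are hypothetical, and the search is cut off at `bound`, so a returned 0 only means that no such $l$ was found up to that bound.

```python
from fractions import Fraction


def characteristic(one, zero, add, bound):
    """Smallest natural number l <= bound with l*1 == 0, or 0 if none is found."""
    total = one                      # total holds l*1 at the start of step l
    for l in range(1, bound + 1):
        if total == zero:
            return l
        total = add(total, one)      # (l + 1)1 = l1 + 1
    return 0


# Z_p, with residue classes represented by the integers 0..p-1.
for p in (2, 3, 5, 7, 11):
    assert characteristic(1, 0, lambda x, y: (x + y) % p, bound=p) == p

# In Q no multiple of 1 vanishes; the bounded search finds none.
assert characteristic(Fraction(1), Fraction(0), lambda x, y: x + y, bound=1000) == 0
```

A finite search cannot prove that a field has characteristic 0; for $\mathbb{Q}$ this follows from the argument in case (i), not from the computation.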


Proposition 1.1.11 The characteristic of a field is either 0 or a prime number $p$.
A field of characteristic 0 contains a subfield isomorphic to the field $\mathbb{Q}$ of rational
numbers, and a field of characteristic $p$ contains a subfield isomorphic to the field
$\mathbb{Z}_p$.


Proof Suppose that $F$ is a field of characteristic 0. Then $n1 = m1$ implies that $n = m$. Also $(m1 \neq 0)$ if and only if $(m \neq 0)$. Suppose that $\frac{m}{n} = \frac{r}{s}$. Then $(m1)(s1) = ms1 = nr1 = (n1)(r1)$. In turn, $(m1)(n1)^{-1} = (r1)(s1)^{-1}$. Thus, we have a map $f$ from $\mathbb{Q}$ to $F$ given by $f(\frac{m}{n}) = (m1)(n1)^{-1}$. Next, suppose that $(m1)(n1)^{-1} = (r1)(s1)^{-1}$. Then $ms1 = (m1)(s1) = (n1)(r1) = nr1$. This means that $ms = nr$, or equivalently, $\frac{m}{n} = \frac{r}{s}$. This shows that $f$ is an injective map. It is also straightforward to verify that $f$ is a field homomorphism. Thus, $L = \{(m1)(n1)^{-1} \mid m \in \mathbb{Z}, n \in \mathbb{Z} - \{0\}\}$ is a subfield of $F$ which is isomorphic to $\mathbb{Q}$.

Next, suppose that the characteristic of $F$ is $l \neq 0$. Then $l$ is the smallest natural number such that $l1 = 0$. We show that $l$ is a prime $p$. Suppose not. Then $l = l_1 l_2$, $1 < l_1 < l$, $1 < l_2 < l$. But, then $0 = l1 = (l_1 l_2)1 = (l_1 1)(l_2 1)$. In turn, $l_1 1 = 0$ or $l_2 1 = 0$. This is a contradiction to the choice of $l$. Thus, the characteristic of $F$ is a prime $p$. Suppose that $\bar{i} = \bar{j}$ in $\mathbb{Z}_p$. Then $p$ divides $i - j$. In turn, $(i - j)1 = 0$, and so $i1 = j1$. Thus, we have a map $f$ from $\mathbb{Z}_p$ to $F$ defined by $f(\bar{i}) = i1$. Clearly, this is an injective field homomorphism. ♯
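The argument above can be tried out on the rings $\mathbb{Z}_n$ of integers modulo $n$. The following is only a minimal sketch, assuming that elements of $\mathbb{Z}_n$ are modelled by Python integers reduced modulo $n$; the names `characteristic` and `proper_factorization` are ours and do not come from the text. When the characteristic $l$ factors as $l_1 l_2$, the two factors are nonzero elements whose product is 0, which is exactly the obstruction used in the proof.

```python
def characteristic(n):
    """Smallest natural number l with l*1 = 0 in Z_n (integers modulo n)."""
    total, l = 1 % n, 1
    while total != 0:
        total = (total + 1) % n   # add 1 once more
        l += 1
    return l


def proper_factorization(l):
    """Return (l1, l2) with l = l1*l2 and both factors strictly between 1 and l,
    or None if no such factorization exists (for l >= 2 this means l is prime)."""
    for l1 in range(2, int(l ** 0.5) + 1):
        if l % l1 == 0:
            return l1, l // l1
    return None


for n in (5, 6, 7, 12):
    l = characteristic(n)
    factors = proper_factorization(l)
    if factors is None:
        print(n, "has prime characteristic", l)
    else:
        l1, l2 = factors
        # (l1*1)(l2*1) = 0 although neither factor is 0, so Z_n is not a field
        print(n, "has zero divisors", l1 % n, "and", l2 % n)
```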


Exercises 


1.1.1 Show that $\mathbb{Q}(\omega) = \{a + b\omega \mid a, b \in \mathbb{Q}\}$, where $\omega$ is a primitive cube root of 1, is a subfield of the field $\mathbb{C}$ of complex numbers.


1.1.2 Show that $\sqrt{\sqrt{2}}$ is not a member of $\mathbb{Q}(\sqrt{2})$. Use the method of Example 1.1.4 to show that

$$\mathbb{Q}(\sqrt{2})(\sqrt{\sqrt{2}}) = \{a + b\sqrt{2} + (c + d\sqrt{2})\sqrt{\sqrt{2}} \mid a, b, c, d \in \mathbb{Q}\}$$

is a field with respect to the addition and multiplication induced by those in $\mathbb{R}$. Generalize the assertion.




1.1.3 Show that

$$\mathbb{Q}(\sqrt{2})(\sqrt{3}) = \{a + b\sqrt{2} + (c + d\sqrt{2})\sqrt{3} \mid a, b, c, d \in \mathbb{Q}\}$$

is a field with respect to the addition and multiplication induced by those in $\mathbb{R}$.


1.1.4 Show that $\mathbb{Q}(2^{1/3}) = \{a + b2^{1/3} + c2^{2/3} \mid a, b, c \in \mathbb{Q}\}$ is also a field with
respect to the addition and multiplication induced by those in R. Express cae as 
1423 


$a + b2^{1/3} + c2^{2/3}$, $a, b, c \in \mathbb{Q}$.


1.1.5 Show that $F = \{0, 1, a, a^2\}$ is a field of characteristic 2 with respect to the addition $+$ and multiplication $\cdot$ given by the following tables:

    +   | 0    1    a    a^2
    ----+-------------------
    0   | 0    1    a    a^2
    1   | 1    0    a^2  a
    a   | a    a^2  0    1
    a^2 | a^2  a    1    0

    ·   | 0    1    a    a^2
    ----+-------------------
    0   | 0    0    0    0
    1   | 0    1    a    a^2
    a   | 0    a    a^2  1
    a^2 | 0    a^2  1    a


1.1.6 Find the multiplicative inverse of $\overline{20}$ in $\mathbb{Z}_{257}$, and also find the solution of $\overline{10}x \oplus \overline{2} = \overline{3}$.


1.1.7 Write a program in the C++ language to check if a natural number $n$ is prime, and if so to find the multiplicative inverse of a nonzero element $m$ in $\mathbb{Z}_n$. Find the output with $n = 2^{2^5} + 1$, and $m = 641$.


1.2 Concept of a Vector Space (Linear Space) 


Consider the space (called the Euclidean 3-space) in which we live. If we fix a point (place) in the 3-space as origin together with three mutually perpendicular lines (directions) passing through the origin as the axes of reference, and also a segment of line as a unit of length, then any point in the 3-space determines, and is determined uniquely by, an ordered triple $(\alpha, \beta, \gamma)$ of real numbers.




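[Figure: the point $P(\alpha, \beta, \gamma)$ in the 3-space relative to the chosen origin and axes of reference]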


Thus, with the given choice of the origin and the axes as above, the space in which 
we live can be represented faithfully by 


$$\mathbb{R}^3 = \{\bar{x} = (x_1, x_2, x_3) \mid x_1, x_2, x_3 \in \mathbb{R}\},$$

and it is called the Euclidean 3-space. The members of $\mathbb{R}^3$ are called the usual 3-vectors. It is also evident that the physical quantities which have magnitudes as well as directions (e.g., force, velocity, or displacement) can be represented by vectors. More generally, for a fixed natural number $n$,

$$\mathbb{R}^n = \{\bar{x} = (x_1, x_2, \ldots, x_n) \mid x_1, x_2, \ldots, x_n \in \mathbb{R}\}$$

is called the Euclidean n-space, and the members of the Euclidean n-space are called the Euclidean n-vectors. We term $x_1, x_2, \ldots, x_n$ the components, or coordinates, of the vector $\bar{x} = (x_1, x_2, \ldots, x_n)$. Thus, $\mathbb{R}^2$ represents the Euclidean plane, and $\mathbb{R}^4$ represents the Minkowski space of events in which the first three coordinates represent the place, and the fourth coordinate represents the time of the occurrence of the event. $\mathbb{R}^1$ is identified with $\mathbb{R}$. By convention, $\mathbb{R}^0 = \{0\}$ is a single point. We have the addition $+$ in $\mathbb{R}^n$, called the addition of vectors, and it is defined by

$$\bar{x} + \bar{y} = (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n),$$

where $\bar{x} = (x_1, x_2, \ldots, x_n)$ and $\bar{y} = (y_1, y_2, \ldots, y_n)$. We have also the external multiplication $\cdot$ by the members of $\mathbb{R}$, called the multiplication by scalars, and it is given by

$$\alpha \cdot \bar{x} = (\alpha x_1, \alpha x_2, \ldots, \alpha x_n), \quad \alpha \in \mathbb{R}.$$




Remark 1.2.1 The addition $+$ of vectors in the 3-space $\mathbb{R}^3$ is the usual addition of vectors, which obeys the parallelogram law of addition.


The Euclidean 3-space $(\mathbb{R}^3, +, \cdot)$ introduced above is a Vector Space in the sense of the following definition:


Definition 1.2.2 A Vector Space (also called a Linear Space) over a field $F$ (called the field of Scalars) is a triple $(V, +, \cdot)$, where $V$ is a set, $+$ is an internal binary operation on $V$, called the addition of vectors, and $\cdot : F \times V \to V$ is an external multiplication, called the multiplication by scalars, such that the following hold:

A. $(V, +)$ is an abelian group in the sense that:

1. $+$ is associative, i.e., $(x + y) + z = x + (y + z)$ for all $x, y, z$ in $V$.
2. $+$ is commutative, i.e., $x + y = y + x$ for all $x, y$ in $V$.
3. We have a unique vector 0 in $V$, called the null vector, and it is such that $x + 0 = x = 0 + x$ for all $x$ in $V$.
4. For each $x$ in $V$, we have a unique vector $-x$ in $V$, called the negative of $x$, and it is such that $x + (-x) = 0 = (-x) + x$.

B. The external multiplication $\cdot$ by scalars satisfies the following conditions:

1. It distributes over the vector addition $+$ in the sense that $\alpha \cdot (x + y) = \alpha \cdot x + \alpha \cdot y$ for all $\alpha \in F$ and $x, y$ in $V$.
2. It distributes over the addition of scalars also in the sense that $(\alpha + \beta) \cdot x = \alpha \cdot x + \beta \cdot x$ for all $\alpha, \beta \in F$ and $x$ in $V$.
3. $(\alpha\beta) \cdot x = \alpha \cdot (\beta \cdot x)$ for all $\alpha, \beta \in F$ and $x$ in $V$.
4. $1 \cdot x = x$ for all $x$ in $V$.


Example 1.2.3 Let $F$ be a field, and $n$ be a natural number. Consider the set

$$V = F^n = \{\bar{x} = (x_1, x_2, \ldots, x_n) \mid x_1, x_2, \ldots, x_n \in F\}$$

of row vectors with $n$ columns, and with entries in $F$. We have the addition $+$ in $F^n$ defined by

$$\bar{x} + \bar{y} = (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n),$$

where $\bar{x} = (x_1, x_2, \ldots, x_n)$ and $\bar{y} = (y_1, y_2, \ldots, y_n)$. We have also the external multiplication $\cdot$ by the members of $F$ defined by

$$a \cdot \bar{x} = (ax_1, ax_2, \ldots, ax_n), \quad a \in F.$$

The field properties of $F$ ensure that the triple $(F^n, +, \cdot)$ is a vector space over $F$. The zero of the vector space is the zero row $\bar{0} = (0, 0, \ldots, 0)$, and the negative of $\bar{x} = (x_1, x_2, \ldots, x_n)$ is $-\bar{x} = (-x_1, -x_2, \ldots, -x_n)$. We can also treat the members of $F^n$ as column vectors.
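For a finite field the claim that $F^n$ satisfies the axioms of Definition 1.2.2 can be checked exhaustively. The sketch below is an illustration only: it assumes $F = \mathbb{Z}_3$ and $n = 2$, represents vectors as Python tuples, and uses the hypothetical names `add`, `scale`, and `check_axioms`, none of which appear in the text.

```python
from itertools import product

P = 3  # a prime, so that Z_P is a field (assumption of this sketch)
N = 2  # length of the row vectors


def add(x, y):
    """Componentwise addition in Z_P^N."""
    return tuple((a + b) % P for a, b in zip(x, y))


def scale(a, x):
    """Multiplication of the vector x by the scalar a."""
    return tuple((a * c) % P for c in x)


vectors = list(product(range(P), repeat=N))
scalars = range(P)


def check_axioms():
    """Verify the axioms A.1-A.4 and B.1-B.4 by brute force."""
    zero = (0,) * N
    for x in vectors:
        assert add(x, zero) == x                      # A.3
        assert add(x, scale(P - 1, x)) == zero        # A.4, with -x = (-1).x
        assert scale(1, x) == x                       # B.4
        for y in vectors:
            assert add(x, y) == add(y, x)             # A.2
            for z in vectors:
                assert add(add(x, y), z) == add(x, add(y, z))   # A.1
            for a in scalars:
                assert scale(a, add(x, y)) == add(scale(a, x), scale(a, y))  # B.1
        for a in scalars:
            for b in scalars:
                assert scale((a + b) % P, x) == add(scale(a, x), scale(b, x))  # B.2
                assert scale((a * b) % P, x) == scale(a, scale(b, x))          # B.3
    return True
```

Calling `check_axioms()` raises an `AssertionError` if any axiom fails and returns `True` otherwise.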


Example 1.2.4 Let $L$ be a subfield of a field $F$. Consider $(F, +, \cdot)$, where $+$ is the addition of the field $F$, and $\cdot$ is the restriction of the multiplication in $F$ to $L \times F$. Then it is evident that $(F, +, \cdot)$ is a vector space over $L$. Thus, every field can be considered as a vector space over each of its subfields.


Example 1.2.5 Let $C[0, 1]$ denote the set of all real valued continuous functions on the closed interval $[0, 1]$. Since the sum of any two continuous functions is a continuous function, we have an addition on $C[0, 1]$ with respect to which it is an abelian group. Define the external multiplication $\cdot$ by $(a \cdot f)(x) = a \cdot f(x)$. Then $C[0, 1]$ is a vector space over the field $\mathbb{R}$ of reals. Note that the set $D[0, 1]$ of differentiable functions is also a vector space over the field $\mathbb{R}$ of reals with respect to the addition of functions, and multiplication by scalars as defined above.


Example 1.2.6 Let $P_n(F)$ denote the set of all polynomials of degree at most $n$ over a field $F$. Then $P_n(F)$ is an abelian group with respect to the addition of polynomials. Further, if $a \in F$ and $f(X) \in P_n(F)$, then $af(X) \in P_n(F)$. Thus, $P_n(F)$ is also a vector space over $F$.


Proposition 1.2.7 Let $V$ be a vector space over a field $F$. Then the following hold:

(i) The cancellation law holds in $(V, +)$ in the sense that $(x + y = x + z)$ implies $y = z$ (in turn, $(y + x = z + x)$ implies $y = z$).
(ii) $0 \cdot x = 0$, where 0 on the left side is the 0 of $F$, 0 on the right side is that of $V$, and $x \in V$.
(iii) $a \cdot 0 = 0$, where both 0 are that of $V$, and $a \in F$.
(iv) $(-a) \cdot x = -(a \cdot x)$ for all $a \in F$ and $x \in V$. In particular, $(-1) \cdot x = -x$.
(v) $(a \cdot x = 0)$ implies that $(a = 0$ or $x = 0)$.


Proof (i) Suppose that $(x + y = x + z)$. Then $y = 0 + y = (-x + x) + y = -x + (x + y) = -x + (x + z) = (-x + x) + z = 0 + z = z$.

(ii) $0 + 0 \cdot x = 0 \cdot x = (0 + 0) \cdot x = 0 \cdot x + 0 \cdot x$. By the cancellation in $(V, +)$, $0 = 0 \cdot x$.

(iii) $0 + a \cdot 0 = a \cdot 0 = a \cdot (0 + 0) = a \cdot 0 + a \cdot 0$. By the cancellation in $(V, +)$, $0 = a \cdot 0$.

(iv) $0 = 0 \cdot x = (-a + a) \cdot x = (-a) \cdot x + a \cdot x$. This shows that $(-a) \cdot x = -(a \cdot x)$.

(v) Suppose that $(a \cdot x = 0)$, and $a \neq 0$. Then, $x = 1 \cdot x = (a^{-1}a) \cdot x = a^{-1} \cdot (a \cdot x) = a^{-1} \cdot 0 = 0$. ♯


1.3. Subspaces 


Definition 1.3.1 Let V be a vector space over a field F. A subset W of V is called 
a subspace, or a linear subspace of V if 


(i) $0 \in W$.
(ii) $x + y \in W$ for all $x, y \in W$.
(iii) $a \cdot x \in W$ for all $a \in F$ and $x \in W$.


Thus, a subspace is also a vector space over the same field in its own right.


Proposition 1.3.2 Let $V$ be a vector space over a field $F$. Then a nonempty subset $W$ of $V$ is a subspace if and only if $ax + by \in W$ for all $a, b \in F$, and $x, y \in W$.


Proof Suppose that $W$ is a subspace of $V$. Let $a, b \in F$, and $x, y \in W$. From Definition 1.3.1 (iii), $ax, by \in W$. In turn, by Definition 1.3.1 (ii), $ax + by \in W$. Conversely, suppose that $W$ is a nonempty subset of $V$ such that $ax + by \in W$ for all $a, b \in F$, and for all $x, y \in W$. Let $x, y \in W$. Then $x + y = 1x + 1y$ belongs to $W$. Further, since $W$ is nonempty, there is an element $x \in W$, and then $0 = 0x + 0x$ belongs to $W$. Also for $x \in W$, and $a \in F$, $ax = ax + 0x \in W$. This shows that $W$ is a subspace of $V$. ♯


Example 1.3.3 Let $V$ be a vector space over a field $F$. Then $V$ is clearly a subspace of $V$, and it is called an improper subspace of $V$. The singleton $\{0\}$ is also a subspace of $V$, and it is called the trivial subspace of $V$. Other subspaces of $V$ are called proper subspaces of $V$.


Example 1.3.4 (Subspaces of $\mathbb{R}^2$ over $\mathbb{R}$) Let $W$ be a nontrivial subspace of $\mathbb{R}^2$. Then there is a nonzero element $(l, m) \in W$. Since $W$ is a subspace, $\alpha \cdot (l, m) = (\alpha l, \alpha m) \in W$ for all $\alpha \in \mathbb{R}$. Thus, $W_{lm} = \{(\alpha l, \alpha m) \mid \alpha \in \mathbb{R}\} \subseteq W$. $W_{lm}$ is easily seen to be a subspace of $\mathbb{R}^2$. Indeed, $W_{lm}$ is the line in the plane $\mathbb{R}^2$ passing through the origin and the point $(l, m)$. Note that all lines in $\mathbb{R}^2$ passing through the origin are of this type. Suppose that $W \neq W_{lm}$. Then there is a nonzero element $(p, q)$ in $W - W_{lm}$. We claim that $ql - pm \neq 0$. Suppose that $ql - pm = 0$. Since $(l, m) \neq (0, 0)$, $l \neq 0$ or $m \neq 0$. Suppose that $l \neq 0$. Then, $(p, q) = (\frac{p}{l}l, \frac{p}{l}m)$ turns out to be in $W_{lm}$, a contradiction to the choice of $(p, q)$. Similarly, if $m \neq 0$, then $(p, q) = (\frac{q}{m}l, \frac{q}{m}m)$, a contradiction. Now, let $(a, b)$ be an arbitrary member of $\mathbb{R}^2$. Since $ql - pm \neq 0$, we can solve the pair of equations $\alpha l + \beta p = a$ and $\alpha m + \beta q = b$. In other words, $(a, b) = \alpha(l, m) + \beta(p, q)$ belongs to $W$, and so $W = \mathbb{R}^2$. This shows that the only proper subspaces of $\mathbb{R}^2$ are the lines passing through the origin.
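The last step of the example, solving $\alpha l + \beta p = a$ and $\alpha m + \beta q = b$ when $ql - pm \neq 0$, can be made explicit. The following is a sketch using Cramer's rule with exact rational arithmetic; the function name `coefficients` is hypothetical and the restriction to rational inputs is an assumption made so that the arithmetic is exact.

```python
from fractions import Fraction


def coefficients(l, m, p, q, a, b):
    """Solve alpha*(l, m) + beta*(p, q) = (a, b) by Cramer's rule.

    The determinant of the system is q*l - p*m, the quantity shown to be
    nonzero in the example.
    """
    det = Fraction(q * l - p * m)
    if det == 0:
        raise ValueError("(l, m) and (p, q) lie on one line through the origin")
    alpha = Fraction(a * q - b * p) / det
    beta = Fraction(l * b - m * a) / det
    return alpha, beta


# (5, 5) written in terms of (1, 2) and (3, 1)
alpha, beta = coefficients(1, 2, 3, 1, 5, 5)
print(alpha, beta)
```

For these inputs the determinant is $1 \cdot 1 - 3 \cdot 2 = -5$, and one checks directly that $2(1, 2) + 1(3, 1) = (5, 5)$.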


Example 1.3.5 (Subspaces of $\mathbb{R}^3$ over $\mathbb{R}$) As in the above example, lines and planes passing through the origin are proper subspaces of $\mathbb{R}^3$ over $\mathbb{R}$. Indeed, they are the only proper subspaces.


Proposition 1.3.6 Intersection of a family of subspaces is a subspace. 


Proof Let $\{W_\alpha \mid \alpha \in A\}$ be a family of subspaces of a vector space $V$ over $F$. Then $0 \in W_\alpha$ for all $\alpha$, and so 0 belongs to the intersection of the family. Thus, the intersection of the given family is nonempty. Let $x, y \in \bigcap_{\alpha \in A} W_\alpha$, and $a, b \in F$. Then $x, y \in W_\alpha$ for all $\alpha$. Since each $W_\alpha$ is a subspace, $ax + by \in W_\alpha$ for all $\alpha$. Hence $ax + by$ belongs to the intersection. This shows that the intersection of the family is a subspace. ♯


Proposition 1.3.7 Union of subspaces need not be a subspace. Indeed, the union $W_1 \cup W_2$ of two subspaces is a subspace if and only if $W_1 \subseteq W_2$ or $W_2 \subseteq W_1$.


Proof If $W_1 \subseteq W_2$, then $W_1 \cup W_2 = W_2$ is a subspace. Similarly, if $W_2 \subseteq W_1$, then also the union is a subspace. Conversely, suppose that $W_1 \cup W_2$ is a subspace and $W_1$ is not a subset of $W_2$. Then there is an element $x \in W_1$ which is not in $W_2$. Let $y \in W_2$. Then, since $W_1 \cup W_2$ is a subspace, $x + y \in W_1 \cup W_2$. Now $x + y$ does not belong to $W_2$, for otherwise $x = (x + y) - y$ will be in $W_2$, a contradiction to the supposition. Hence $x + y \in W_1$. Since $x \in W_1$ and $W_1$ is a subspace, $y = -x + (x + y)$ belongs to $W_1$. This shows that $W_2 \subseteq W_1$. ♯
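A small finite example makes the first assertion of Proposition 1.3.7 concrete. This is a sketch under the assumption that the field is $\mathbb{Z}_2$ and that subsets of $\mathbb{Z}_2^2$ are modelled as Python sets of tuples; `is_subspace` is a hypothetical helper that simply tests the criterion of Proposition 1.3.2.

```python
from itertools import product

P = 2  # work over the field Z_2 (assumption of this sketch)


def add(x, y):
    return tuple((a + b) % P for a, b in zip(x, y))


def scale(a, x):
    return tuple((a * c) % P for c in x)


def is_subspace(w):
    """Test the criterion of Proposition 1.3.2 on a finite subset w of Z_P^n:
    w is nonempty and a*x + b*y lies in w for all scalars a, b and x, y in w."""
    w = set(w)
    return bool(w) and all(
        add(scale(a, x), scale(b, y)) in w
        for a, b in product(range(P), repeat=2)
        for x in w
        for y in w
    )


W1 = {(0, 0), (1, 0)}   # the line spanned by (1, 0)
W2 = {(0, 0), (0, 1)}   # the line spanned by (0, 1)

print(is_subspace(W1), is_subspace(W2), is_subspace(W1 | W2))
```

Here neither line contains the other, and the union fails to be a subspace because $(1, 0) + (0, 1) = (1, 1)$ lies in neither $W_1$ nor $W_2$.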


Proposition 1.3.8 Let $W_1$ and $W_2$ be subspaces of a vector space $V$ over a field $F$. Then $W_1 + W_2 = \{x + y \mid x \in W_1, y \in W_2\}$ is also a subspace (called the sum of $W_1$ and $W_2$) which is the smallest subspace containing $W_1 \cup W_2$.


Proof Since $0 \in W_2$, $x \in W_1$ implies that $x = x + 0 \in W_1 + W_2$. Thus, $W_1 \subseteq W_1 + W_2$. Similarly, $W_2 \subseteq W_1 + W_2$. Also, if $L$ is a subspace containing $W_1 \cup W_2$, then $x + y \in L$ for all $x \in W_1$, and $y \in W_2$. Therefore, it is sufficient to show that $W_1 + W_2$ is a subspace. Clearly, $W_1 + W_2 \neq \emptyset$. Let $x + y$ and $u + v$ belong to $W_1 + W_2$, where $x, u \in W_1$, and $y, v \in W_2$, and let $\alpha, \beta \in F$. Since $W_1$ and $W_2$ are subspaces, $\alpha x + \beta u \in W_1$, and $\alpha y + \beta v \in W_2$. But, then $\alpha(x + y) + \beta(u + v) = (\alpha x + \beta u) + (\alpha y + \beta v)$ belongs to $W_1 + W_2$. ♯


Definition 1.3.9 A family $\{W_\alpha \mid \alpha \in A\}$ of subspaces of a vector space $V$ over a field $F$ is called a chain if for any given pair $\alpha, \beta \in A$, $W_\alpha \subseteq W_\beta$, or $W_\beta \subseteq W_\alpha$.


Proposition 1.3.10 Union of a chain of subspaces is a subspace. 


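Proof Let $\{W_\alpha \mid \alpha \in A\}$ be a chain of subspaces of a vector space $V$ over a field $F$. Clearly, $0 \in \bigcup_{\alpha \in A} W_\alpha$. Let $x, y \in \bigcup_{\alpha \in A} W_\alpha$, and $a, b \in F$. Then $x \in W_\alpha$, and $y \in W_\beta$ for some $\alpha, \beta \in A$. Since the family is a chain, $W_\alpha \subseteq W_\beta$, or $W_\beta \subseteq W_\alpha$. This means that $x, y \in W_\beta$, or $x, y \in W_\alpha$. Since $W_\alpha$ and $W_\beta$ are subspaces, $ax + by$ belongs to $W_\alpha$ or to $W_\beta$. It follows that $ax + by \in \bigcup_{\alpha \in A} W_\alpha$. This shows that $\bigcup_{\alpha \in A} W_\alpha$ is a subspace. ♯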


Subspace Generated (Spanned) by a Subset 


Definition 1.3.11 A subset $S$ of a vector space $V$ over a field $F$ need not be a subspace, for example, it may not contain 0. The intersection of all subspaces of $V$ containing $S$ is the smallest subspace of $V$ containing $S$. This subspace is called the subspace generated (spanned) by $S$, and it is denoted by $\langle S \rangle$. If $\langle S \rangle = V$, then we say that $S$ generates $V$, or $S$ is a set of generators of $V$. A vector space $V$ is said to be finitely generated if it has a finite set of generators.


Clearly, $\langle \emptyset \rangle = \{0\}$.


Remark 1.3.12 The subspace $\langle S \rangle$ of $V$ generated by $S$ is completely characterized by the following 3 properties:


(i) $\langle S \rangle$ is a subspace.
(ii) $\langle S \rangle$ contains $S$.
(iii) If $W$ is a subspace containing $S$, then $\langle S \rangle \subseteq W$.


Definition 1.3.13 Let $S$ be a nonempty subset of a vector space $V$ over a field $F$. An element $x \in V$ is called a linear combination of members of $S$ if

$$x = a_1x_1 + a_2x_2 + \cdots + a_nx_n$$

for some $a_1, a_2, \ldots, a_n \in F$ and $x_1, x_2, \ldots, x_n \in S$. We also say that $x$ depends linearly on $S$.


Remark 1.3.14 If $S$ is a nonempty set, then 0 is always a linear combination of the members of $S$, for $0 = 0x$ for any $x \in S$. All the members of $S$ are linear combinations of members of $S$, for any $x \in S$ is $1x$. Further, if $x$ is a linear combination of members of $S$, and $S \subseteq T$, then $x$ is also a linear combination of members of $T$. A linear combination of linear combinations of members of $S$ is again a linear combination of members of $S$.


Proposition 1.3.15 Let S be a nonempty subset of a vector space V over a field F. 
Then < S > is the set of all linear combinations of members of S. 


Proof Let $W$ denote the set of all linear combinations of members of $S$. Since members of $S$ are also linear combinations of members of $S$, it follows that $S \subseteq W$. Thus, $W$ is a nonempty set. Let $x = a_1x_1 + a_2x_2 + \cdots + a_nx_n$ and $y = b_1y_1 + b_2y_2 + \cdots + b_my_m$ be members of $W$, and $a, b \in F$. Then

$$ax + by = aa_1x_1 + aa_2x_2 + \cdots + aa_nx_n + bb_1y_1 + bb_2y_2 + \cdots + bb_my_m,$$

being a linear combination of members of $S$, is again a member of $W$, and so $W$ is a subspace of $V$. Let $L$ be a subspace of $V$ containing $S$. It follows, by induction on $r$, that any linear combination $a_1x_1 + a_2x_2 + \cdots + a_rx_r$ of members of $S$ belongs to $L$. Thus, $W$ is the smallest subspace of $V$ containing $S$. ♯


In particular, S is a set of generators of a vector space V over a field F if and only 
if every element of V is a linear combination of members of S. 


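Example 1.3.16 The set $E = \{\bar{e}_1, \bar{e}_2, \ldots, \bar{e}_n\}$, where

$$\bar{e}_i = (0, 0, \ldots, 0, 1, 0, \ldots, 0)$$

(with 1 at the $i$th place and 0 elsewhere), is a set of generators of the vector space $F^n$. Indeed, any member $\bar{x} = (x_1, x_2, \ldots, x_n)$ of $F^n$ is the linear combination $\bar{x} = x_1\bar{e}_1 + x_2\bar{e}_2 + \cdots + x_n\bar{e}_n$ of members of $E$. The subset $S = \{\bar{e}_1 + \bar{e}_2, \bar{e}_2 + \bar{e}_3, \bar{e}_3 + \bar{e}_1\}$ is also a set of generators of $F^3$ (provided that the characteristic of $F$ is not 2, so that 2 is invertible in $F$), for $\bar{x} = (x_1, x_2, x_3) = a_1(\bar{e}_1 + \bar{e}_2) + a_2(\bar{e}_2 + \bar{e}_3) + a_3(\bar{e}_3 + \bar{e}_1)$, where

$$a_1 = \frac{x_1 + x_2 - x_3}{2}, \quad a_2 = \frac{x_2 + x_3 - x_1}{2}, \quad a_3 = \frac{x_3 + x_1 - x_2}{2} \quad \text{(verify)}.$$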


Example 1.3.17 Consider the subset $S = \{\bar{e}_1 - \bar{e}_2, \bar{e}_2 - \bar{e}_3, \bar{e}_3 - \bar{e}_1\}$ of $\mathbb{R}^3$. It is easy to verify that $\bar{x} = (x_1, x_2, x_3)$ is a linear combination of members of $S$ if and only if $x_1 + x_2 + x_3 = 0$. Thus, the subspace $\langle S \rangle$ of $\mathbb{R}^3$ generated by $S$ is the plane $\{\bar{x} = (x_1, x_2, x_3) \mid x_1 + x_2 + x_3 = 0\}$.
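The verification left to the reader in Example 1.3.17 can be sketched as follows. The two function names are hypothetical, and the particular coefficients $(x_1, x_1 + x_2, 0)$ are just one convenient choice among many, since the three members of $S$ are linearly dependent.

```python
def combination(a1, a2, a3):
    """The vector a1*(e1 - e2) + a2*(e2 - e3) + a3*(e3 - e1) as a triple."""
    return (a1 - a3, a2 - a1, a3 - a2)


def coefficients_for(x):
    """One choice of coefficients expressing x as a combination of S,
    or None when x1 + x2 + x3 != 0 (x is then outside the plane)."""
    x1, x2, x3 = x
    if x1 + x2 + x3 != 0:
        return None
    return (x1, x1 + x2, 0)


# every combination of members of S has coordinate sum 0 ...
print(sum(combination(3, -7, 5)))
# ... and every vector with coordinate sum 0 is such a combination
print(combination(*coefficients_for((2, 5, -7))))
```

The first function shows that the coordinates of any combination add up to $(a_1 - a_3) + (a_2 - a_1) + (a_3 - a_2) = 0$; the second gives the converse.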


Linear Independence 


Definition 1.3.18 A subset $S$ of a vector space $V$ over a field $F$ is called linearly independent if given any finite subset $\{x_1, x_2, \ldots, x_n\}$ of $S$, $x_i \neq x_j$ for $i \neq j$,

$$a_1x_1 + a_2x_2 + \cdots + a_nx_n = 0 \text{ implies that } a_i = 0 \text{ for all } i.$$

A subset $S$ which is not linearly independent is called a linearly dependent subset.


Thus, a subset $S$ of a vector space $V$ over a field $F$ is linearly dependent if there is a subset $\{x_1, x_2, \ldots, x_n\}$ of distinct members of $S$, and $a_1, a_2, \ldots, a_n$ not all zero in $F$ such that

$$a_1x_1 + a_2x_2 + \cdots + a_nx_n = 0.$$

Vacuously, the empty set $\emptyset$ is linearly independent. The observations in the following proposition are easy but crucial, and they will be used often.


Proposition 1.3.19 Let V be a vector space over a field F. Then, 


(i) any subset of V containing 0 is linearly dependent, 
(ii) every subset of a linearly independent subset of V is linearly independent, 
(iii) every subset containing a linearly dependent set is linearly dependent, 
(iv) if $S$ is a subset of $V$, and $x \in \langle S \rangle - S$, then $S \cup \{x\}$ is linearly dependent, and
(v) if $S$ is linearly independent, and $x \notin \langle S \rangle$, then $S \cup \{x\}$ is linearly independent.




Proof (i) If $0 \in S$, then $1 \cdot 0 = 0$ but $1 \neq 0$. It follows from the definition that $S$ is linearly dependent.

The assertions (ii) and (iii) are immediate from the definition itself.

(iv) Suppose that $x \notin S$, and $x \in \langle S \rangle$. Then there are distinct members $x_1, x_2, \ldots, x_n \in S$, and $a_1, a_2, \ldots, a_n \in F$ such that

$$x = a_1x_1 + a_2x_2 + \cdots + a_nx_n.$$

But, then

$$(-1)x + a_1x_1 + a_2x_2 + \cdots + a_nx_n = 0.$$

Since $-1 \neq 0$, it follows that $S \cup \{x\}$ is linearly dependent.

(v) Suppose that $S$ is linearly independent, and $x \notin \langle S \rangle$. Suppose that $x_1, x_2, \ldots, x_n$ are distinct members of $S \cup \{x\}$ such that

$$a_1x_1 + a_2x_2 + \cdots + a_nx_n = 0.$$

If $x_i \neq x$ for all $i$, then since $S$ is linearly independent, $a_i = 0$ for all $i$. Suppose that $x_i = x$ for some $i$. Without any loss, we may suppose that $x_1 = x$. Then $a_1 = 0$, for otherwise

$$x = (-a_1)^{-1}a_2x_2 + (-a_1)^{-1}a_3x_3 + \cdots + (-a_1)^{-1}a_nx_n$$

belongs to $\langle S \rangle$, a contradiction to the supposition. Thus, $a_1 = 0$. Hence

$$a_2x_2 + a_3x_3 + \cdots + a_nx_n = 0.$$

Since $S$ is linearly independent, $a_i = 0$ for all $i$. ♯


Proposition 1.3.20 A subset $S$ of a vector space $V$ over a field $F$ is linearly independent if and only if given distinct members $x_1, x_2, \ldots, x_n \in S$, and $a_1, a_2, \ldots, a_n, b_1, b_2, \ldots, b_n \in F$,

$$a_1x_1 + a_2x_2 + \cdots + a_nx_n = b_1x_1 + b_2x_2 + \cdots + b_nx_n$$

implies that $a_i = b_i$ for all $i$.

Proof Suppose that $S$ is linearly independent, and

$$a_1x_1 + a_2x_2 + \cdots + a_nx_n = b_1x_1 + b_2x_2 + \cdots + b_nx_n,$$

where $x_1, x_2, \ldots, x_n$ are distinct members of $S$. Then

$$(a_1 - b_1)x_1 + (a_2 - b_2)x_2 + \cdots + (a_n - b_n)x_n = 0.$$

Since $S$ is linearly independent, $a_i - b_i = 0$ for all $i$, and so $a_i = b_i$ for all $i$. Conversely, suppose that the condition is satisfied, and

$$a_1x_1 + a_2x_2 + \cdots + a_nx_n = 0,$$

where $x_1, x_2, \ldots, x_n$ are distinct members of $S$. Then,

$$a_1x_1 + a_2x_2 + \cdots + a_nx_n = 0x_1 + 0x_2 + \cdots + 0x_n.$$

From the given condition, $a_i = 0$ for all $i$. This shows that $S$ is linearly independent. ♯
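
Example 1.3.21 The set $E = \{\bar{e}_1, \bar{e}_2, \ldots, \bar{e}_n\}$ described in Example 1.3.16 is a linearly independent subset of $F^n$, for $(x_1, x_2, \ldots, x_n) = x_1\bar{e}_1 + x_2\bar{e}_2 + \cdots + x_n\bar{e}_n = \bar{0} = (0, 0, \ldots, 0)$ implies that each $x_i = 0$. Also, the subset $S = \{\bar{e}_1 + \bar{e}_2, \bar{e}_2 + \bar{e}_3, \bar{e}_3 + \bar{e}_1\}$ of $F^3$ is linearly independent (provided that the characteristic of $F$ is not 2), for $a_1(\bar{e}_1 + \bar{e}_2) + a_2(\bar{e}_2 + \bar{e}_3) + a_3(\bar{e}_3 + \bar{e}_1) = \bar{0} = (0, 0, 0)$ implies that $a_1 + a_3 = 0 = a_1 + a_2 = a_2 + a_3$. But, then $a_1 = a_2 = a_3 = 0$. However, the subset $S = \{\bar{e}_1 - \bar{e}_2, \bar{e}_2 - \bar{e}_3, \bar{e}_3 - \bar{e}_1\}$ of $F^3$ is linearly dependent, for $1(\bar{e}_1 - \bar{e}_2) + 1(\bar{e}_2 - \bar{e}_3) + 1(\bar{e}_3 - \bar{e}_1) = \bar{0}$.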
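For vectors with rational entries, linear independence can be tested mechanically. The sketch below uses Gaussian elimination, a technique not introduced at this point of the text: a finite list of vectors is linearly independent exactly when elimination leaves no zero row, i.e. when the rank equals the number of vectors. The names `rank` and `is_linearly_independent` are ours, and the restriction to the field $\mathbb{Q}$ is an assumption made so that the arithmetic is exact.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of vectors with rational entries, by Gaussian elimination."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        # look for a row at or below position r with a nonzero entry in this column
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def is_linearly_independent(vectors):
    """A finite list of vectors is independent iff its rank equals its length."""
    return rank(vectors) == len(vectors)


sums = [(1, 1, 0), (0, 1, 1), (1, 0, 1)]          # e1+e2, e2+e3, e3+e1
differences = [(1, -1, 0), (0, 1, -1), (-1, 0, 1)]  # e1-e2, e2-e3, e3-e1
print(is_linearly_independent(sums), is_linearly_independent(differences))
```

The two test lists are the sets $S$ of Example 1.3.21 over $\mathbb{Q}$; the empty list is reported as independent, in agreement with the convention for the empty set.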


1.4 Basis and Dimension 


Definition 1.4.1 A subset S of a vector space V over a field F is said to be a minimal 
set of generators or irreducible set of generators if 

(i) $S$ generates $V$, i.e., $\langle S \rangle = V$, and

(ii) no proper subset of $S$ generates $V$.

More precisely, $\langle S \rangle = V$, and $\langle S - \{x\} \rangle \neq V$ for all $x \in S$.


Definition 1.4.2. A subset B of a vector space V over a field F is said to be a maximal 
linearly independent set if 

(i) B is linearly independent, and 

(ii) $B \subsetneq S$ implies that $S$ is linearly dependent.

More precisely, a linearly independent subset $B$ is maximal linearly independent if for all $x \notin B$, $B \cup \{x\}$ is linearly dependent.


The following two propositions say that maximal linearly independent sets and minimal sets of generators are the same.


Proposition 1.4.3 Every minimal set of generators is also a maximal linearly inde- 
pendent set. 


Proof Let $S$ be a minimal set of generators of a vector space $V$ over a field $F$. Suppose that $S$ is not linearly independent. Then there exists a set $\{x_1, x_2, \ldots, x_n\}$ of distinct members of $S$, and $a_1, a_2, \ldots, a_n$ not all 0 in $F$ such that

$$a_1x_1 + a_2x_2 + \cdots + a_nx_n = 0.$$

Since the addition is commutative, without any loss, we may assume that $a_1 \neq 0$. But, then

$$x_1 = (-a_1)^{-1}a_2x_2 + (-a_1)^{-1}a_3x_3 + \cdots + (-a_1)^{-1}a_nx_n.$$

This shows that $x_1$ is a linear combination of members of $S - \{x_1\}$, or equivalently, $x_1 \in \langle S - \{x_1\} \rangle$. Thus, $S \subseteq \langle S - \{x_1\} \rangle$. Since $\langle S \rangle$ is the smallest subspace containing $S$, $V = \langle S \rangle \subseteq \langle S - \{x_1\} \rangle$. It follows that $\langle S - \{x_1\} \rangle = V$. This is a contradiction to the supposition that $S$ is a minimal set of generators of $V$. Thus, $S$ is linearly independent. Next, suppose that $x \notin S$. Since $S$ is also a set of generators, $x \in \langle S \rangle - S$, and it follows from Proposition 1.3.19(iv) that $S \cup \{x\}$ is linearly dependent. This completes the proof of the fact that $S$ is maximal linearly independent. ♯


Conversely, we have the following proposition:


Proposition 1.4.4 A maximal linearly independent subset is also a minimal set of 
generators. 


Proof Let $B$ be a maximal linearly independent subset of a vector space $V$ over a field $F$. Let $x \in V$. If $x \in B$, then $x \in \langle B \rangle$. Suppose that $x \notin B$. Since $B$ is a maximal linearly independent subset of $V$, $B \cup \{x\}$ is linearly dependent. Hence there exists a set $\{x_1, x_2, \ldots, x_n\}$ of distinct members of $B \cup \{x\}$, and $a_1, a_2, \ldots, a_n$ not all 0 in $F$ such that

$$a_1x_1 + a_2x_2 + \cdots + a_nx_n = 0.$$

One of the $x_i$ is $x$ and the corresponding $a_i \neq 0$, otherwise $B$ will turn out to be linearly dependent, a contradiction to the supposition that $B$ is linearly independent. We may assume, without loss of generality, that $x_1 = x$ and $a_1 \neq 0$. But, then

$$x = x_1 = (-a_1)^{-1}a_2x_2 + (-a_1)^{-1}a_3x_3 + \cdots + (-a_1)^{-1}a_nx_n.$$

Hence $x \in \langle B \rangle$. This shows that $B$ is a set of generators of $V$. Finally, for each $x \in B$, $x \notin \langle B - \{x\} \rangle$, otherwise, from Proposition 1.3.19(iv), $B = (B - \{x\}) \cup \{x\}$ will turn out to be linearly dependent. Hence no proper subset of $B$ generates $V$. This shows that $B$ is a minimal set of generators. ♯


Most of the implications in the following theorem are already established. 


Theorem 1.4.5 Let B be a subset of a vector space V over a field F. Then the 
following conditions are equivalent: 

1. $B$ is a maximal linearly independent subset of $V$.

2. $B$ is a minimal set of generators of $V$.

3. $B$ is linearly independent as well as a set of generators of $V$.

4. Every nonzero element $x \in V$ can be expressed uniquely (up to order) as

$$x = a_1x_1 + a_2x_2 + \cdots + a_nx_n,$$

where $x_1, x_2, \ldots, x_n$ are distinct members of $B$, and $a_1, a_2, \ldots, a_n$ are all nonzero members of $F$.


Proof The equivalence of 1 and 2 follows from Proposition 1.4.3 and Proposition 1.4.4. The implication $2 \Rightarrow 3$ follows from Proposition 1.4.3.

($3 \Rightarrow 4$). Assume 3. Since $B$ is a set of generators and also linearly independent, 4 follows from Proposition 1.3.20.

($4 \Rightarrow 1$). Assume 4. It follows again from Proposition 1.3.20 that $B$ is linearly independent. Suppose that $x \notin B$. By (4), $x$ is a linear combination of members of $B$, and so $B \cup \{x\}$ is linearly dependent. This shows that $B$ is a maximal linearly independent subset. ♯


Definition 1.4.6 A subset B of a vector space V over a field F is called a basis of 
V if it satisfies any one, and hence all, of the conditions in the Theorem 1.4.5. 


Example 1.4.7 The set $E = \{\bar{e}_1, \bar{e}_2, \ldots, \bar{e}_n\}$ described in Example 1.3.16 is a linearly independent subset (Example 1.3.21) as well as a set of generators of $F^n$ (Example 1.3.16), and hence it is a basis of $F^n$. This basis is called the standard basis of $F^n$. Similarly, $S = \{\bar{e}_1 + \bar{e}_2, \bar{e}_2 + \bar{e}_3, \bar{e}_3 + \bar{e}_1\}$ is another basis of $F^3$ (when the characteristic of $F$ is not 2).


Proposition 1.4.8 Let V be a finitely generated vector space over a field F. Then V 
has a finite basis. Indeed, any finite set of generators contains a basis. 


Proof Let $S$ be a finite set of generators of $V$. It may be a minimal set of generators and so a basis. If not, $\langle S - \{x_1\} \rangle = V$ for some $x_1 \in S$. $S - \{x_1\}$ may be a minimal set of generators and so a basis. If not, then $\langle S - \{x_1, x_2\} \rangle = V$ for some $x_2 \in S - \{x_1\}$. $S - \{x_1, x_2\}$ may be a minimal set of generators and so a basis. If not, proceed. This process stops after finitely many steps giving us a basis contained in $S$, for $S$ is finite. ♯
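The process in the proof can be sketched over a small finite field, where the subspace generated by a finite set can be listed exhaustively. This is only an illustration under stated assumptions: the field is $\mathbb{Z}_3$, vectors are tuples, and `span` and `reduce_to_basis` are hypothetical names. A generator is discarded whenever the remaining ones still generate the same subspace.

```python
from itertools import product

P = 3  # a prime, so that Z_P is a field (assumption of this sketch)


def span(vectors, n):
    """All linear combinations of `vectors` in Z_P^n.
    For the empty list this is {0}, in agreement with the span of the empty set."""
    return {
        tuple(sum(c * x[k] for c, x in zip(coeffs, vectors)) % P for k in range(n))
        for coeffs in product(range(P), repeat=len(vectors))
    }


def reduce_to_basis(generators, n):
    """Discard redundant generators one at a time, as in the proof above."""
    s = list(generators)
    target = span(s, n)
    for v in list(s):
        rest = list(s)
        rest.remove(v)                 # drop one copy of v
        if span(rest, n) == target:    # v is redundant: the rest still generate
            s = rest
    return s


generators = [(1, 0), (2, 0), (0, 1), (1, 1)]
print(reduce_to_basis(generators, 2))
```

Which basis is obtained depends on the order in which the generators are examined; a different order may leave a different two-element subset.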


Theorem 1.4.9 Let V be a finitely generated vector space over a field F. Then every 
basis of V is finite, and any two bases of V contain the same number of elements. 


Proof From the above proposition, $V$ has a finite basis

$$B_1 = \{x_1, x_2, \ldots, x_n\} \text{ (say)}.$$

Let $B_2$ be another basis of $V$. If $B_1 - B_2 = \emptyset$, then $B_1 \subseteq B_2$. Since $B_1$ and $B_2$ are both maximal linearly independent sets (being bases), $B_1 = B_2$, and we are done. Suppose that $B_1 \neq B_2$. Then $B_2 - B_1 \neq \emptyset$, otherwise $B_2 \subseteq B_1$, and again $B_1 = B_2$. Let $y_1 \in B_2 - B_1$. Since $B_1$, being a basis, is maximal linearly independent, $B_1 \cup \{y_1\}$ is linearly dependent. Thus, there exist $a_1, a_2, \ldots, a_n, b_1$ not all 0 in the field $F$ such that

$$a_1x_1 + a_2x_2 + \cdots + a_nx_n + b_1y_1 = 0.$$

Indeed, $b_1 \neq 0$, otherwise

$$a_1x_1 + a_2x_2 + \cdots + a_nx_n = 0,$$

and then all $a_i = 0$. Further, since $y_1 \neq 0$, $b_1y_1 \neq 0$. Hence $a_i \neq 0$ for some $i$. We may assume that $a_1 \neq 0$. But, then

$$x_1 = (-a_1)^{-1}a_2x_2 + (-a_1)^{-1}a_3x_3 + \cdots + (-a_1)^{-1}a_nx_n + (-a_1)^{-1}b_1y_1.$$

Hence, $x_1 \in \langle (B_1 - \{x_1\}) \cup \{y_1\} \rangle$. This shows that $B_1 \subseteq \langle (B_1 - \{x_1\}) \cup \{y_1\} \rangle$, and so $(B_1 - \{x_1\}) \cup \{y_1\}$ generates $V$. We also show that $(B_1 - \{x_1\}) \cup \{y_1\}$ is linearly independent. Suppose that

$$a_2x_2 + a_3x_3 + \cdots + a_nx_n + b_1y_1 = 0.$$

If $b_1 = 0$, then

$$a_2x_2 + a_3x_3 + \cdots + a_nx_n = 0.$$

Since $B_1$ is linearly independent, $a_i = 0$ for all $i \geq 2$. Suppose that $b_1 \neq 0$. Then

$$y_1 = (-b_1)^{-1}a_2x_2 + (-b_1)^{-1}a_3x_3 + \cdots + (-b_1)^{-1}a_nx_n,$$

and so $y_1 \in \langle B_1 - \{x_1\} \rangle$. Since $(B_1 - \{x_1\}) \cup \{y_1\}$ is already seen to be a set of generators of $V$, $\langle B_1 - \{x_1\} \rangle = V$. This is a contradiction to the supposition that $B_1$ is a basis (minimal set of generators). This shows that $(B_1 - \{x_1\}) \cup \{y_1\}$ is also a basis containing $n$ elements. If $((B_1 - \{x_1\}) \cup \{y_1\}) - B_2 = \emptyset$, then as before $(B_1 - \{x_1\}) \cup \{y_1\} = B_2$, and so $B_2$ contains $n$ elements. If not, as before, $B_2 - ((B_1 - \{x_1\}) \cup \{y_1\})$ is nonempty, and then proceed as above. The process stops after finitely many steps, at most at the $n$th step, showing that $B_2$ is finite, and contains exactly $n$ elements. ♯


Definition 1.4.10 The number of elements in a basis of a finitely generated vector 
space V over a field F is called the dimension of V, and it is denoted by dim(V). 


It follows from Example 1.4.7 that the dimension of $F^n$ is $n$. The dimension of the plane $W = \{\bar{x} = (x_1, x_2, x_3) \mid x_1 + x_2 + x_3 = 0\}$ is 2, for $\{\bar{e}_1 - \bar{e}_2, \bar{e}_2 - \bar{e}_3\}$ is a basis of $W$ (verify). To determine the dimension of a vector space, one needs to determine a basis of the vector space, and then count the number of elements in the basis. In the next chapter we shall have an algorithm to find a basis, and so also the dimensions of the subspaces of $F^n$ which are generated by finite sets of elements of $F^n$.


Proposition 1.4.11 Every set of generators of a finite dimensional vector space 
contains a basis. 




Proof Let $B = \{x_1, x_2, \ldots, x_n\}$ be a finite basis of a finite dimensional vector space $V$ over a field $F$. Let $S$ be a set of generators of $V$. Then each member $x_i$ is a linear combination of a finite subset $A_i$ (say) of $S$. In turn, each member of $B$ is a linear combination of the finite subset $A = A_1 \cup A_2 \cup \cdots \cup A_n$ of $S$. Since $B$ generates $V$, $A$ also generates $V$. Since $A$ is finite, we can reduce $A$ to a minimal set of generators, a basis of $V$. ♯


Proposition 1.4.12 Every linearly independent subset of a finite dimensional vector 
space V over a field F can be enlarged to a basis of V. 


Proof Let $B = \{x_1, x_2, \ldots, x_n\}$ be a finite basis of a finite dimensional vector space $V$ over a field $F$, and $S$ a linearly independent subset of $V$. If $B \subseteq \langle S \rangle$, then $\langle S \rangle = V$, and so $S$ is a basis, and there is nothing to do. If not, then some $x_i \in B - \langle S \rangle$. We may assume that $x_1 \in B - \langle S \rangle$. Then by Proposition 1.3.19(v), $S \cup \{x_1\}$ is linearly independent. If $B \subseteq \langle S \cup \{x_1\} \rangle$, then $V = \langle B \rangle \subseteq \langle S \cup \{x_1\} \rangle$, and so $S \cup \{x_1\}$ turns out to be a basis. If not, proceed. This process stops at most at the $n$th step, enlarging $S$ to a basis. ♯


Corollary 1.4.13 If $\dim V = n$, then


(i) every linearly independent subset contains at most $n$ elements, and
(ii) any set of generators contains at least $n$ elements. ♯


Proposition 1.4.14 Let $F$ be a finite field containing $q$ elements, and $V$ a vector space over $F$ of dimension $n$. Then $V$ contains exactly $q^n$ elements.


Proof Since $\dim V = n$, there is a basis $\{x_1, x_2, \ldots, x_n\}$ of $V$ containing $n$ elements. Hence every element $v$ of $V$ can be expressed uniquely as

$$v = a_1x_1 + a_2x_2 + \cdots + a_nx_n,$$

where $a_1, a_2, \ldots, a_n$ belong to $F$. This says that we have a bijective map $\eta$ from $F^n$ to $V$ defined by

$$\eta(a_1, a_2, \ldots, a_n) = a_1x_1 + a_2x_2 + \cdots + a_nx_n.$$

Since $F$ contains $q$ elements, $F^n$ and hence $V$ contains $q^n$ elements. ♯
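The bijection $\eta$ of the proof can be observed directly in a small case. The following sketch assumes $F = \mathbb{Z}_2$ and takes $V = \mathbb{Z}_2^3$ with a basis chosen only for illustration; the name `eta` mirrors the map of the proof, while the particular basis is hypothetical.

```python
from itertools import product

Q = 2                                        # number of elements of the field Z_2
basis = [(1, 1, 0), (0, 1, 1), (0, 0, 1)]    # a basis of Z_2^3 (illustrative choice)


def eta(coeffs):
    """The map (a_1, ..., a_n) -> a_1 x_1 + ... + a_n x_n, computed in Z_Q."""
    n = len(basis[0])
    return tuple(sum(a * x[k] for a, x in zip(coeffs, basis)) % Q for k in range(n))


# distinct coefficient tuples give distinct vectors, so the image has Q^n elements
images = {eta(c) for c in product(range(Q), repeat=len(basis))}
print(len(images))
```

Since the image is collected in a set, its size equals $q^n$ precisely when no two coefficient tuples give the same vector, which is the uniqueness statement used in the proof.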


Corollary 1.4.15 Let $F$ be a finite field of characteristic $p$ (note that $p$ is prime). Then $F$ contains $p^n$ elements for some $n \in \mathbb{N}$.


Proof Since $F$ is finite, its characteristic is some prime $p \neq 0$. By Proposition 1.1.11, $F$ has a subfield isomorphic to the field $\mathbb{Z}_p$ of residue classes modulo the prime $p$. Thus, $F$ is a vector space over a field containing $p$ elements. Since it is finite, its dimension is finite, $n$ (say). From the previous proposition, the result follows. ♯




Corollary 1.4.16 Let $L$ be a field containing $p^n$ elements, where $p$ is a prime, and $n \geq 1$. Let $F$ be a subfield of $L$. Then $F$ contains $p^m$ elements for some divisor $m$ of $n$.


Proof Since $F$ is a subfield of $L$, $\operatorname{char} L = \operatorname{char} F = p$. Thus, $F$ contains $p^m$ elements for some $m$. Since $L$ is a vector space over $F$, it follows that $L$ contains $(p^m)^r = p^{mr}$ elements for some $r$. Hence $n = mr$. ♯


Remark 1.4.17 We shall see in a later chapter that for every prime $p$, and for all $n \geq 1$, there is a unique (up to isomorphism) field of order $p^n$. Further, corresponding to any divisor $m$ of $n$, there is a unique subfield of order $p^m$.


Definition 1.4.18 Let $V$ be a vector space over a field $F$. An ordered $n$-tuple $(x_1, x_2, \ldots, x_n)$ is called an ordered basis of $V$ if the set $\{x_1, x_2, \ldots, x_n\}$ is a basis of $V$ containing $n$ elements. Thus, to every basis of a vector space of dimension $n$ there correspond exactly $n!$ distinct ordered bases which give rise to the same basis.


Proposition 1.4.19 Let V be a vector space of dimension n over a finite field F 
containing q elements. Then the number of ordered bases of V is 


(q" — 1)(q" — q)(q" -—@)---(q" - 4"), 


and the number of bases of V is 


q@’-)N@-@::-@¢@- gq’) 


n! 
Proof We find the number of ordered $n$-tuples $(x_1, x_2, \ldots, x_n)$ such that the set
$\{x_1, x_2, \ldots, x_n\}$ is a basis. Since $x_1$ can be any nonzero element of the vector space,
the number of ways in which $x_1$ can be selected is $q^n - 1$. Having chosen $x_1$, we
have to select $x_2$ such that $\{x_1, x_2\}$ is linearly independent. Clearly, $\{x_1, x_2\}$ is linearly
independent if and only if $x_2 \neq \alpha x_1$ for all $\alpha \in F$. Since there are $q$ such multiples of $x_1$,
the number of choices for $x_2$ is $q^n - q$. Thus, the number of ways in which
the ordered pair $(x_1, x_2)$ can be chosen so that the set $\{x_1, x_2\}$ is linearly independent
is $(q^n - 1)(q^n - q)$. Again, having chosen $(x_1, x_2)$, we have to find $x_3$ so that
$\{x_1, x_2, x_3\}$ is linearly independent. Now, $\{x_1, x_2, x_3\}$ is linearly independent if and
only if $x_3 \neq \alpha_1 x_1 + \alpha_2 x_2$ for every pair $\alpha_1, \alpha_2 \in F$. Thus, the number of choices for
$x_3$ is $q^n - q^2$. Hence the number of choices for the ordered triple $(x_1, x_2, x_3)$ so that
the set $\{x_1, x_2, x_3\}$ is linearly independent is $(q^n - 1)(q^n - q)(q^n - q^2)$. Proceeding
inductively, the number of choices for the ordered $n$-tuple $(x_1, x_2, \ldots, x_n)$ so that
$\{x_1, x_2, \ldots, x_n\}$ is linearly independent (and so a basis) is

$$(q^n - 1)(q^n - q)(q^n - q^2)\cdots(q^n - q^{n-1}).$$

In turn, the number of bases of $V$ is

$$\frac{(q^n - 1)(q^n - q)\cdots(q^n - q^{n-1})}{n!}.$$ ♯
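
The count in Proposition 1.4.19 can be checked numerically for small cases. The following Python sketch is only an illustration: the function names are ours, not the text's, and the brute-force routine assumes a prime field $\mathbb{Z}_p$ and is practical only for very small $p$ and $n$.

```python
from itertools import product
from math import factorial


def ordered_bases_count(q, n):
    """Evaluate (q^n - 1)(q^n - q)...(q^n - q^(n-1))."""
    count = 1
    for i in range(n):
        count *= q**n - q**i
    return count


def bases_count(q, n):
    """Each basis gives rise to exactly n! ordered bases."""
    return ordered_bases_count(q, n) // factorial(n)


def brute_force_ordered_bases(p, n):
    """Count ordered bases of (Z_p)^n directly; p is assumed to be prime."""
    vectors = list(product(range(p), repeat=n))

    def span_size(vs):
        # Collect all linear combinations of the vectors in vs.
        combos = {
            tuple(sum(c * v[k] for c, v in zip(coeffs, vs)) % p for k in range(n))
            for coeffs in product(range(p), repeat=len(vs))
        }
        return len(combos)

    # n vectors form a basis exactly when they span all p^n vectors.
    return sum(1 for t in product(vectors, repeat=n) if span_size(t) == p**n)


print(ordered_bases_count(2, 3), bases_count(2, 3))
print(brute_force_ordered_bases(2, 3))
```

For $q = 2$ and $n = 3$ the formula gives $7 \cdot 6 \cdot 4 = 168$ ordered bases and $168/3! = 28$ bases, and the direct enumeration agrees.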




Remark 1.4.20 If $W$ is a subspace of a vector space $V$, then since a basis of $W$ is a
linearly independent subset of $V$, $\dim W \leq \dim V$. Further, if $n$ is the dimension of
$V$, and $m \leq n$, then there is a subspace of dimension $m$, for if $\{x_1, x_2, \ldots, x_n\}$ is a
basis of $V$, then the subspace $\langle \{x_1, x_2, \ldots, x_m\} \rangle$ is a subspace of dimension $m$.


Proposition 1.4.21 Let $V$ be a vector space of dimension $n$ over a field $F$ containing
$q$ elements. Then the number $b_r$ of $r$-dimensional subspaces, $r \geq 1$, is

$$b_r = \frac{(q^n - 1)(q^n - q)\cdots(q^n - q^{r-1})}{(q^r - 1)(q^r - q)\cdots(q^r - q^{r-1})}.$$

The total number of subspaces is

$$1 + b_1 + b_2 + \cdots + b_n,$$

where $b_r$ is given above.


Proof Let $0 < r \leq n$. Any subspace of dimension $r$ is determined by a linearly
independent subset $\{x_1, x_2, \ldots, x_r\}$ of $V$. From the proof of the above proposition,
it follows that the number of linearly independent subsets containing $r$ elements is

$$\frac{(q^n - 1)(q^n - q)\cdots(q^n - q^{r-1})}{r!}.$$

Further, a linearly independent subset $\{y_1, y_2, \ldots, y_r\}$ determines the same subspace
as the set $\{x_1, x_2, \ldots, x_r\}$ if and only if $\{y_1, y_2, \ldots, y_r\}$ is a basis of the subspace
$\langle \{x_1, x_2, \ldots, x_r\} \rangle$, which is of dimension $r$. The number of bases of a vector space
of dimension $r$ is

$$\frac{(q^r - 1)(q^r - q)\cdots(q^r - q^{r-1})}{r!}.$$

Thus, the number of $r$-dimensional subspaces of $V$ is the quotient of these two numbers, which is
the number $b_r$ given in the statement of the proposition. ♯
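
As a sanity check on Proposition 1.4.21, the sketch below evaluates $b_r$ and the total number of subspaces, and compares the total with a direct enumeration over $\mathbb{Z}_2$. The function names and the encoding of vectors of $\mathbb{Z}_2^n$ as $n$-bit integers (with XOR as addition) are our own assumptions for illustration.

```python
from itertools import combinations


def subspaces_count(q, n, r):
    """Number b_r of r-dimensional subspaces of an n-dimensional space over a field with q elements."""
    numerator = denominator = 1
    for i in range(r):
        numerator *= q**n - q**i
        denominator *= q**r - q**i
    return numerator // denominator


def total_subspaces(q, n):
    """1 + b_1 + ... + b_n; the term r = 0 counts the zero subspace."""
    return sum(subspaces_count(q, n, r) for r in range(n + 1))


def brute_force_subspaces_f2(n):
    """Enumerate the subspaces of (Z_2)^n as spans of at most n vectors."""
    found = set()
    for k in range(n + 1):
        for gens in combinations(range(2**n), k):
            span = {0}
            for g in gens:
                # Adjoining g to a subspace S gives S together with its translate S + g.
                span |= {s ^ g for s in span}
            found.add(frozenset(span))
    return len(found)


print(total_subspaces(2, 3), brute_force_subspaces_f2(3))
```

For $q = 2$, $n = 3$ one gets $b_1 = 7$, $b_2 = 7$, $b_3 = 1$, so the total is $1 + 7 + 7 + 1 = 16$.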


Proposition 1.4.22 Let $V$ be a vector space of finite dimension over a field $F$. Let
$W_1$ and $W_2$ be subspaces of $V$. Then $W_1 + W_2 = \{x + y \mid x \in W_1,\ y \in W_2\}$ is a
subspace, and

$$\dim(W_1 + W_2) = \dim W_1 + \dim W_2 - \dim(W_1 \cap W_2).$$

Proof We have already seen that $W_1 + W_2$ is a subspace. Let $\{x_1, x_2, \ldots, x_r\}$ be a
basis of $W_1 \cap W_2$. Then it is a linearly independent subset of $W_1$ as well as of $W_2$.
Thus, it can be enlarged to a basis

$$\{x_1, x_2, \ldots, x_r, y_1, y_2, \ldots, y_m\}$$

of $W_1$, and to a basis

$$\{x_1, x_2, \ldots, x_r, z_1, z_2, \ldots, z_n\}$$

of $W_2$, where $\dim W_1 = r + m$ and $\dim W_2 = r + n$. We show that

$$S = \{x_1, x_2, \ldots, x_r, y_1, y_2, \ldots, y_m, z_1, z_2, \ldots, z_n\}$$

is a basis of $W_1 + W_2$. Clearly, $W_1 \subseteq \langle S \rangle$ and $W_2 \subseteq \langle S \rangle$. Hence $W_1 + W_2 \subseteq \langle S \rangle$.
Also $S \subseteq W_1 \cup W_2$. Hence $\langle S \rangle \subseteq W_1 + W_2$. Now, we show that $S$ is linearly
independent. Suppose that

$$\alpha_1 x_1 + \cdots + \alpha_r x_r + \beta_1 y_1 + \cdots + \beta_m y_m + \delta_1 z_1 + \cdots + \delta_n z_n = 0.$$

Then,

$$\alpha_1 x_1 + \cdots + \alpha_r x_r + \beta_1 y_1 + \cdots + \beta_m y_m = -\delta_1 z_1 - \cdots - \delta_n z_n$$

belongs to $W_1 \cap W_2$. Since $\{x_1, x_2, \ldots, x_r\}$ is a basis of $W_1 \cap W_2$,

$$-\delta_1 z_1 - \delta_2 z_2 - \cdots - \delta_n z_n = \gamma_1 x_1 + \gamma_2 x_2 + \cdots + \gamma_r x_r$$

for some $\gamma_1, \gamma_2, \ldots, \gamma_r$ in $F$. Since $\{x_1, x_2, \ldots, x_r, z_1, z_2, \ldots, z_n\}$ is linearly
independent (being a basis of $W_2$), it follows that $\delta_1, \delta_2, \ldots, \delta_n$ are all zero. Similarly,
$\beta_1, \beta_2, \ldots, \beta_m$ are all zero. But, then $\alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_r x_r = 0$. Since
$\{x_1, x_2, \ldots, x_r\}$ is linearly independent (being a basis of $W_1 \cap W_2$), we see that
$\alpha_1, \alpha_2, \ldots, \alpha_r$ are all zero. This shows that $S$ is also linearly independent and so
a basis of $W_1 + W_2$. In turn, $\dim(W_1 + W_2) = r + m + n = (r + m) + (r + n) - r = \dim W_1 + \dim W_2 - \dim(W_1 \cap W_2)$. ♯


Example 1.4.23 Let $V$ be a vector space of dimension $n$. Let $W_1$ and $W_2$ be distinct
subspaces of dimension $n - 1$ each. Then $W_1 + W_2$ is a subspace of $V$ containing
$W_1$ properly, and so it is $V$. Thus, $\dim(W_1 + W_2) = n$. Hence, from the above
proposition, $\dim(W_1 \cap W_2) = \dim W_1 + \dim W_2 - \dim(W_1 + W_2) = (n - 1) + (n - 1) - n = n - 2$.
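
The dimension formula of Proposition 1.4.22 and the example above can be tried out on a small case. This is a minimal sketch under our own conventions: vectors of $\mathbb{Z}_2^4$ are encoded as 4-bit integers, addition is XOR, and the helper names are hypothetical.

```python
def span_f2(generators):
    """Subspace of (Z_2)^n generated by the given bit-encoded vectors."""
    span = {0}
    for g in generators:
        span |= {s ^ g for s in span}
    return span


def dim_f2(subspace):
    """A subspace with 2^d elements has dimension d."""
    return len(subspace).bit_length() - 1


def sum_f2(w1, w2):
    """W1 + W2 = {x + y | x in W1, y in W2}."""
    return {x ^ y for x in w1 for y in w2}


# Two distinct subspaces of dimension n - 1 = 3 in (Z_2)^4.
w1 = span_f2([0b1000, 0b0100, 0b0010])
w2 = span_f2([0b0100, 0b0010, 0b0001])

lhs = dim_f2(sum_f2(w1, w2))
rhs = dim_f2(w1) + dim_f2(w2) - dim_f2(w1 & w2)
print(lhs, rhs, dim_f2(w1 & w2))
```

Here `w1 & w2` is the set intersection, so the printed values are the two sides of the formula followed by $\dim(W_1 \cap W_2) = n - 2 = 2$.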


1.5 Direct Sum of Vector Spaces, Quotient of a Vector Space 


Let $V_1, V_2, \ldots, V_r$ be vector spaces over a field $F$. Consider the Cartesian product
$V = V_1 \times V_2 \times \cdots \times V_r$. Define addition using the coordinate-wise addition, and
the external multiplication $\cdot$ by $a \cdot (x_1, x_2, \ldots, x_r) = (a x_1, a x_2, \ldots, a x_r)$. It is
straightforward to verify that $V$ is a vector space over $F$ with respect to these
operations. This vector space is called the external direct sum of $V_1, V_2, \ldots, V_r$.

A vector space $V$ over a field $F$ is said to be the internal direct sum of its subspaces
$V_1, V_2, \ldots, V_r$ if every element $x$ of $V$ has a unique representation as

$$x = x_1 + x_2 + \cdots + x_r,$$

where $x_i \in V_i$ for all $i$. The notation

$$V = V_1 \oplus V_2 \oplus \cdots \oplus V_r$$

stands to assert that $V$ is the direct sum of its subspaces $V_1, V_2, \ldots, V_r$.


Proposition 1.5.1 Let $V_1, V_2, \ldots, V_r$ be subspaces of a vector space $V$ over a field
$F$. Then the following conditions are equivalent.

(1) $V = V_1 \oplus V_2 \oplus \cdots \oplus V_r$.
(2) $V = V_1 + V_2 + \cdots + V_r$, and $V_i \cap \hat{V}_i = \{0\}$ for all $i$, where
$\hat{V}_i = V_1 + V_2 + \cdots + V_{i-1} + V_{i+1} + V_{i+2} + \cdots + V_r$.


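Proof $1 \Longrightarrow 2$. Assume 1. Since every element $x$ of $V$ has a unique representation
as $x = x_1 + x_2 + \cdots + x_r$ with $x_i \in V_i$, $V = V_1 + V_2 + \cdots + V_r$. Let $x \in V_i \cap \hat{V}_i$, where
$\hat{V}_i$ denotes the sum of all the $V_j$ with $j \neq i$. Then
$x = x_1 + x_2 + \cdots + x_{i-1} + x_{i+1} + x_{i+2} + \cdots + x_r$, where $x_j$, $j \neq i$, belongs to $V_j$. Thus,

$$0 = x_1 + x_2 + \cdots + x_{i-1} - x + x_{i+1} + \cdots + x_r = 0 + 0 + \cdots + 0.$$

From the uniqueness of the representation of $0$, it follows that $x_j$ is zero for all $j \neq i$, and
$-x$ is also $0$. Hence $x = 0$.

$2 \Longrightarrow 1$. Assume 2. Clearly, every element $x$ of $V$ has a representation
$x = x_1 + x_2 + \cdots + x_r$, where $x_i \in V_i$ for all $i$. Now, we prove the uniqueness of the
representation. Suppose that

$$x = x_1 + x_2 + \cdots + x_r = y_1 + y_2 + \cdots + y_r,$$

where $x_i, y_i \in V_i$ for all $i$. Then $x_i - y_i = \sum_{j \neq i} (y_j - x_j)$ belongs to $V_i \cap \hat{V}_i = \{0\}$.
This shows that $x_i = y_i$ for all $i$. ♯
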
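Remark 1.5.2 If $V$ is the direct sum of $V_1, V_2, \ldots, V_r$, then the sum
$\hat{V}_i = V_1 + \cdots + V_{i-1} + V_{i+1} + \cdots + V_r$ considered in the above
proposition is the direct sum of $V_1, V_2, \ldots, V_{i-1}, V_{i+1}, V_{i+2}, \ldots, V_r$.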


Proposition 1.5.3 Let $W_1, W_2, \ldots, W_r$ be subspaces of a vector space $V$ such that
$V = W_1 + W_2 + \cdots + W_r$. Then $V$ is the direct sum of $W_1, W_2, \ldots, W_r$ if and only if
$\dim V = \dim W_1 + \dim W_2 + \cdots + \dim W_r$.


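Proof Suppose that $V$ is the direct sum of $W_1, W_2, \ldots, W_r$. We show that
$\dim V = \dim W_1 + \dim W_2 + \cdots + \dim W_r$. The proof is by induction on $r$. If $r = 1$,
then there is nothing to do. Assume that the result is true for direct sums of $r$ subspaces, and
let $V$ be the direct sum of $W_1, W_2, \ldots, W_{r+1}$. Put $\hat{W}_1 = W_2 + W_3 + \cdots + W_{r+1}$.
Since $W_1 \cap \hat{W}_1 = \{0\}$, it follows from Proposition 1.4.22 that
$\dim V = \dim(W_1 + \hat{W}_1) = \dim W_1 + \dim \hat{W}_1 - \dim \{0\} = \dim W_1 + \dim \hat{W}_1$.
Further, it is clear that $\hat{W}_1$ is the direct sum of $W_2, W_3, \ldots, W_{r+1}$, and so by the
induction assumption $\dim \hat{W}_1 = \dim W_2 + \dim W_3 + \cdots + \dim W_{r+1}$.
Hence $\dim V = \dim W_1 + \dim W_2 + \cdots + \dim W_{r+1}$. Conversely, suppose that
$\dim V = \dim W_1 + \dim W_2 + \cdots + \dim W_r$. For each $i$, let $\hat{W}_i$ denote the sum of all the
$W_j$ with $j \neq i$. Repeated use of Proposition 1.4.22 gives $\dim \hat{W}_i \leq \sum_{j \neq i} \dim W_j$. Since
$V = W_i + \hat{W}_i$, we get $\dim V = \dim W_i + \dim \hat{W}_i - \dim(W_i \cap \hat{W}_i) \leq \dim V - \dim(W_i \cap \hat{W}_i)$.
Hence $W_i \cap \hat{W}_i = \{0\}$ for all $i$. The result follows from Proposition 1.5.1. ♯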




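Example 1.5.4 The plane $W = \{(x, y, z) \mid lx + my + nz = 0\}$, $(l, m, n) \neq (0, 0, 0)$,
is a subspace of $\mathbb{R}^3$ of dimension 2 (verify). Let $(a, b, c) \notin W$. The line
$L = \{(\alpha a, \alpha b, \alpha c) \mid \alpha \in \mathbb{R}\}$ is also a subspace of $\mathbb{R}^3$ of dimension 1 such that
$W + L = \mathbb{R}^3$. Since $\dim W + \dim L = 3$, $\mathbb{R}^3 = W \oplus L$.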


Let $V$ be a vector space, and $W$ a subspace of $V$. Let $x \in V$. The subset
$x + W = \{x + w \mid w \in W\}$ is called the coset of $V$ modulo $W$ determined by $x$. This is also
called a plane in $V$ passing through $x$ and parallel to $W$. The set $\{x + W \mid x \in V\}$
of cosets of $V$ modulo $W$ is denoted by $V/W$.


Proposition 1.5.5 Let W be a subspace of a vector space V over a field F. Then the 
following hold: 


(i) $x \in x + W$ for all $x \in V$.
(ii) $x + W = y + W$ if and only if $x - y \in W$.
(iii) $x + W \neq y + W$ if and only if $(x + W) \cap (y + W) = \emptyset$.


In particular, V/W is a partition of the vector space V. 


Proof (i) Since $0 \in W$, $x = x + 0 \in x + W$ for all $x \in V$.

(ii) Suppose that $x + W = y + W$. Then $x \in y + W$. Hence $x = y + w$
for some $w \in W$. In turn, $x - y = w$ belongs to $W$. Conversely, suppose that
$x - y \in W$. Then $x + w = y + (x - y) + w$ belongs to $y + W$ for all $w \in W$.
This shows that $x + W \subseteq y + W$. Similarly, since $y - x$ also belongs to $W$, it
follows that $y + W \subseteq x + W$. Thus, $x + W = y + W$.

(iii) Suppose that $(x + W) \cap (y + W) \neq \emptyset$. Let $z \in (x + W) \cap (y + W)$. Then
$z = x + u = y + v$ for some $u, v \in W$. But, then $x - y = v - u$ belongs to
$W$. It follows from (ii) that $x + W = y + W$. Conversely, if $x + W = y + W$, then by (i)
the intersection contains $x$, and so it is nonempty. ♯


Corollary 1.5.6 Let W be a subspace of a vector space V over a field F. Then the 
following hold: 


(i) If $x + W = x' + W$ and $y + W = y' + W$, then $(x + y) + W = (x' + y') + W$.
(ii) If $x + W = x' + W$, then $ax + W = ax' + W$ for all $a \in F$.

In turn, we have an internal binary operation $+$ on $V/W$, and an external multiplication
$\cdot$ on $V/W$ by scalars given by

$$(x + W) + (y + W) = (x + y) + W,$$

and

$$a \cdot (x + W) = a \cdot x + W.$$

Further, $V/W$ is a vector space with respect to these operations.


Proof (i) Suppose that $x + W = x' + W$, and $y + W = y' + W$. Then $x - x'$
and $y - y'$ belong to $W$. Since $W$ is a subspace, $(x + y) - (x' + y')$ belongs to
$W$. This shows that $(x + y) + W = (x' + y') + W$.

(ii) Suppose that $x + W = x' + W$. Then $x - x' \in W$. Since $W$ is a subspace,
$ax - ax' = a(x - x')$ belongs to $W$. This shows that $ax + W = ax' + W$
for all $a \in F$.

(i) and (ii) ensure that we have the addition $+$ on $V/W$, and the multiplication $\cdot$
by scalars as described in the corollary. The verification of the fact that $V/W$ is a
vector space with respect to the operations described is straightforward. The zero
of the vector space is the coset $0 + W = W$, and the negative of $x + W$ is
$-x + W$. ♯

Definition 1.5.7 The vector space $V/W$ described above is called the quotient space of $V$ modulo $W$.


Proposition 1.5.8 Let $V$ be a finite dimensional vector space over a field $F$, and $W$
a subspace of $V$. Then

$$\dim V/W = \dim V - \dim W.$$


Proof Let $\{x_1, x_2, \ldots, x_r\}$ be a basis of $W$, and $\{y_1 + W, y_2 + W, \ldots, y_s + W\}$
be a basis of $V/W$. We show that $\{x_1, x_2, \ldots, x_r, y_1, y_2, \ldots, y_s\}$ is a basis of $V$. Let
$x \in V$. Then $x + W \in V/W$. Since $\{y_1 + W, y_2 + W, \ldots, y_s + W\}$ is a basis of
$V/W$,

$$x + W = \alpha_1(y_1 + W) + \alpha_2(y_2 + W) + \cdots + \alpha_s(y_s + W) = (\alpha_1 y_1 + \alpha_2 y_2 + \cdots + \alpha_s y_s) + W$$

for some $\alpha_1, \alpha_2, \ldots, \alpha_s$ in $F$. Hence,

$$x - (\alpha_1 y_1 + \alpha_2 y_2 + \cdots + \alpha_s y_s)$$

belongs to $W$. Since $\{x_1, x_2, \ldots, x_r\}$ is a basis of $W$,

$$x - (\alpha_1 y_1 + \alpha_2 y_2 + \cdots + \alpha_s y_s) = \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_r x_r$$

for some $\beta_1, \beta_2, \ldots, \beta_r$ in $F$. Thus,

$$x = \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_r x_r + \alpha_1 y_1 + \alpha_2 y_2 + \cdots + \alpha_s y_s$$

for some $\beta_i, \alpha_j \in F$. This shows that $\{x_1, x_2, \ldots, x_r, y_1, y_2, \ldots, y_s\}$ generates $V$.
Next, suppose that

$$\beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_r x_r + \alpha_1 y_1 + \alpha_2 y_2 + \cdots + \alpha_s y_s = 0.$$

Since $\beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_r x_r$ belongs to $W$,

$$\alpha_1(y_1 + W) + \alpha_2(y_2 + W) + \cdots + \alpha_s(y_s + W) = (\alpha_1 y_1 + \alpha_2 y_2 + \cdots + \alpha_s y_s) + W = W.$$

Since $W$ is the zero of $V/W$, and $\{y_1 + W, y_2 + W, \ldots, y_s + W\}$ (being a basis
of $V/W$) is linearly independent, $\alpha_j = 0$ for all $j$. But, then
$\beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_r x_r = 0$. Since $\{x_1, x_2, \ldots, x_r\}$ (being a basis of $W$) is linearly independent,
$\beta_i = 0$ for all $i$. This shows that $\{x_1, x_2, \ldots, x_r, y_1, y_2, \ldots, y_s\}$ is linearly independent, and
so it is a basis also. Thus,

$$\dim V = r + s = \dim W + \dim V/W.$$ ♯
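
For a finite field the statements about cosets and the quotient space can be verified by direct enumeration. The sketch below is an illustration only, using our own encoding: vectors of $\mathbb{Z}_2^3$ are 3-bit integers, addition is XOR, and `span_f2` and `cosets_f2` are hypothetical helper names.

```python
def span_f2(generators):
    """Subspace of (Z_2)^n generated by the given bit-encoded vectors."""
    span = {0}
    for g in generators:
        span |= {s ^ g for s in span}
    return span


def cosets_f2(n, w):
    """The set V/W of cosets x + W for V = (Z_2)^n."""
    return {frozenset(x ^ y for y in w) for x in range(2**n)}


n = 3
w = span_f2([0b011])            # a subspace of dimension 1
quotient = cosets_f2(n, w)

# The cosets partition V: they cover V and their sizes add up to |V|.
assert set().union(*quotient) == set(range(2**n))
assert sum(len(c) for c in quotient) == 2**n

# Coset addition is well defined: sums of representatives fill exactly one coset.
for a in quotient:
    for b in quotient:
        assert frozenset(x ^ y for x in a for y in b) in quotient

# |V/W| = 2^(dim V - dim W), in agreement with dim V/W = dim V - dim W.
print(len(quotient))
```

With $\dim V = 3$ and $\dim W = 1$, the quotient has $2^{3-1} = 4$ elements.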


In this book, we shall be mainly interested in finite dimensional vector spaces. However,
a vector space need not be finite dimensional, and to develop the analogous
theory for infinitely generated vector spaces, one uses some equivalent of the axiom of
choice (for example, Zorn's Lemma).


Proposition 1.5.9 Union of a chain of linearly independent subsets is linearly inde- 
pendent. 


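Proof Let $\{S_\alpha \mid \alpha \in A\}$ be a chain of linearly independent subsets. Then $0 \notin S_\alpha$
for all $\alpha$, and hence $0 \notin \bigcup_{\alpha \in A} S_\alpha$. Let $x_1, x_2, \ldots, x_r$ be distinct elements of $\bigcup_{\alpha \in A} S_\alpha$.
Suppose that $x_i \in S_{\alpha_i}$. Since $\{S_\alpha \mid \alpha \in A\}$ is a chain, there exists $k$ such that
$S_{\alpha_i} \subseteq S_{\alpha_k}$ for all $i$. Thus $x_1, x_2, \ldots, x_r$ all belong to $S_{\alpha_k}$. Since $S_{\alpha_k}$ is linearly
independent,

$$a_1 x_1 + a_2 x_2 + \cdots + a_r x_r = 0 \text{ implies that } a_i = 0 \text{ for all } i.$$

This shows that the union is linearly independent. ♯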


Proposition 1.5.10 Every linearly independent subset can be embedded into a
basis.


Proof All that we need to show is that every linearly independent subset can be embedded
in a maximal linearly independent subset. Let $S$ be a linearly independent subset
of a vector space $V$ over a field $F$. Let $X$ be the set of all linearly independent subsets
which contain $S$. Then $X \neq \emptyset$, for $S \in X$. Thus, $(X, \subseteq)$ is a nonempty partially
ordered set. From the previous proposition, it follows that every chain in $X$ has an
upper bound (namely, its union). By Zorn's Lemma, $X$ has a maximal element $T$ (say). Clearly, $T$ is
also a maximal linearly independent subset. ♯


Proposition 1.5.11 Every set S of generators contains a basis. 


Proof Let $S$ be a set of generators of $V$. If $S = \{0\}$, or $S = \emptyset$, then $V = \{0\}$,
and then $\emptyset \subseteq S$ is a basis of $V$. Suppose that $S \neq \{0\}$. Let $X$ be the set of all linearly
independent subsets of $S$. If $x$ is a nonzero element in $S$, then $\{x\} \in X$, and so $X \neq \emptyset$.
Thus, $(X, \subseteq)$ is a nonempty partially ordered set. Since the union of a chain of linearly
independent subsets is a linearly independent subset, every chain in $X$ has an upper
bound. By Zorn's Lemma, $X$ has a maximal element $B$ (say). We claim that $S \subseteq \langle B \rangle$, for if not,
there exists an element $x \in S - \langle B \rangle$. Consider the subset $B' = B \cup \{x\}$. Suppose
that

$$a x + a_1 x_1 + a_2 x_2 + \cdots + a_r x_r = 0,$$

where $x_1, x_2, \ldots, x_r$ are all in $B$, and $x_i \neq x_j$ for $i \neq j$. But, then $ax \in \langle B \rangle$. Since
$x \notin \langle B \rangle$, $a = 0$. Since $B$ is linearly independent, $a_i = 0$ for all $i$. This shows
that $B'$ is linearly independent. This is a contradiction to the supposition that $B$ is a
maximal linearly independent subset of $S$. Thus, $S \subseteq \langle B \rangle$, and so $\langle B \rangle = V$.
This shows that $B$ is a basis of $V$ contained in $S$. ♯
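
When the generating set is finite, no appeal to Zorn's Lemma is needed: a basis can be extracted by scanning the generators once and keeping those that are not in the span of the ones already kept. The sketch below illustrates this greedy procedure over $\mathbb{Z}_2$ (bit-encoded vectors, XOR as addition); it is our own illustration of the finite case, not the argument of the proof above, and the function name is hypothetical.

```python
def basis_from_generators_f2(generators):
    """Greedily pick a linearly independent subset with the same span."""
    basis = []
    span = {0}
    for g in generators:
        if g not in span:
            # g is not a combination of the vectors kept so far, so keep it
            # and enlarge the span by the translate span + g.
            basis.append(g)
            span |= {s ^ g for s in span}
    return basis


generators = [0b000, 0b110, 0b011, 0b101, 0b111]
basis = basis_from_generators_f2(generators)
print([format(b, "03b") for b in basis])
```

In this run the zero vector and `101` (which equals `110 + 011`) are discarded, and the remaining three vectors form a basis of $\mathbb{Z}_2^3$ contained in the generating set.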


Exercises 


1.5.1 Test the following for being a subspace of $F^3$. Find the subspace generated by
those which are not subspaces.


(i) $W = \{\bar{x} = (x_1, x_2, x_3) \mid x_1 + x_2 = x_3\}$.
(ii) $W = \{\bar{x} = (x_1, x_2, x_3) \mid x_1 + 2x_2 + x_3 = 0 = 2x_1 + x_2 + 3x_3\}$.
(iii) $W = \{\bar{x} = (x_1, x_2, x_3) \mid x_1 + 2x_2 + x_3 = 1\}$.
(iv) W = {X = (x1, x2, x3) | xy + 2x. = ae 
(v) $W = \{\bar{x} = (x_1, x_2, x_3) \mid x_1^2 + x_2^2 + x_3^2 = 1\}$.
(vi) $W = \{(x, \sin x, \cos x) \mid x \in \mathbb{R}\}$.
(vii) $W = \{\bar{x} = (x_1, x_2, x_3) \mid x_1^2 + x_2^2 + x_3^2 = 1 = x_1 + 2x_2 + x_3\}$.


1.5.2 Which of the following subsets are linearly independent, and why?


(i) The subset $\{(1, 1, 0), (0, 1, 1), (1, 0, 1)\}$ of $F^3$.
(ii) The subset $\{(1, 1, 1, 0), (2, 3, 4, 0), (4, 9, 16, 0), (2, 3, 7, 0)\}$ of $F^4$.
(iii) The subset $\{(1, 1, 1, 0), (1, 3, 4, 0), (1, 9, 16, 0)\}$ of $F^4$.
(iv) The sphere $S^2 = \{(x, y, z) \mid x^2 + y^2 + z^2 = 1\}$ in $F^3$.
(v) The subset $\{(x, x^2) \mid x \in \mathbb{R}\}$ of $\mathbb{R}^2$.


1.5.3 Let $V$ be a finite dimensional vector space, and $W$ a subspace such that
$\dim W = \dim V$. Show that $W = V$. Is this result true for infinite dimensional
vector spaces? Support.


1.5.4 Show that a nontrivial proper subspace of $\mathbb{R}^3$ is either a line passing through
the origin or a plane passing through the origin.


1.5.5 Show that the intersection of two distinct planes passing through the origin is a
line passing through the origin.


1.5.6 Let $W_1$ and $W_2$ be two distinct subspaces of dimension $n - 1$ of a vector space
$W$ of dimension $n$. Show that the dimension of $W_1 \cap W_2$ is $n - 2$.


1.5.7 A subspace $W$ of dimension $n - 1$ of a vector space $V$ of dimension $n$ is
called a hyperplane in $V$. Show that for every hyperplane $W$ of dimension $n - 1$ of
the vector space $F^n$, there exists $\bar{a} = (a_1, a_2, \ldots, a_n) \in F^n - \{\bar{0}\}$ such that

$$W = \{\bar{x} = (x_1, x_2, \ldots, x_n) \mid a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = 0\}.$$


1.5.8 Show that a subspace (also called a plane) of dimension r of a vector space V 
of dimension n is an intersection of n — r distinct hyperplanes. 




1.5.9: What are the results in the above section which remain true even when a field 
is replaced by the set Z of integers with usual addition and multiplication? What are 
the best possible modifications in the results which are not true for Z so that it holds 
for Z also? 


1.5.10 Show that $\{e_1 + e_2, e_2 + e_3, \ldots, e_{n-1} + e_n, e_n\}$, where $\{e_1, e_2, \ldots, e_n\}$ is the
standard basis of the vector space $F^n$, is another basis of $F^n$.


1.5.11 Characterize a vector space with unique basis. Can we have a vector space 
with exactly 2 bases? Support. 


1.5.12 Find the number of bases of Z; over Zp. 
1.5.13 Find the number of subspaces of Z; over Zp. 


1.5.14 Show that a vector space V has no proper subspace if and only if it is of 
dimension 1. 


1.5.15 Embed $(1, 1, 0, 2)$ into a basis of $\mathbb{R}^4$.


1.5.16 Show that $\{(1, 2, 1), (3, 1, 2), (1, 1, 1)\}$ forms a basis of $\mathbb{R}^3$ over $\mathbb{R}$. Express
$(3, 5, 2)$ as a linear combination of members of the above basis.


1.5.17 Can we have a nontrivial finite vector space over a field of characteristic 0, 
or over any infinite field? Support. 


1.5.18 Let F be a field of order 32. Find the number of subfields of F. 


1.5.19 Let $P_n(F)$ denote the set of polynomials of degree at most $n$ over the field
$F$. Show that $P_n(F)$ is a vector space over $F$ with respect to the usual addition of
polynomials and multiplication by scalars. Find a basis of this vector space and so
also the dimension.


1.5.20 Let $W_1$ be a subspace of dimension $n - 1$ of a vector space $V$ of dimension
$n$. Let $W_2$ be a subspace of dimension $r$ such that $W_2$ is not contained in $W_1$. Find
the dimension of $W_1 \cap W_2$.


1.5.21 Let $V$ be a vector space over an infinite field. Can we express $V$ as a finite union
of proper subspaces? Support.


1.5.22 Show that $\{1, (X - 1), (X - 1)^2, \ldots, (X - 1)^n\}$ is a basis of the vector space
$P_n(F)$ of polynomials of degree at most $n$.


1.5.23 Let $W = \{(x_1, x_2, \ldots, x_n) \in \mathbb{R}^n \mid x_1 + 2x_2 + 3x_3 + \cdots + n x_n = 0\}$. Show
that $W$ is a subspace of $\mathbb{R}^n$. Find the dimension of $W$.


1.5.24 Find a vector space having exactly 3 bases. Is such a vector space unique? 
Support. 




1.5.25 Let $V$ be a vector space over a field $F$ and $W$ a subspace. Show that there is
a subspace $W'$ such that $V = W \oplus W'$.


1.5.26 Let $W$ be a nontrivial proper subspace of the Euclidean vector space $\mathbb{R}^3$ of
dimension 3. Show that $W$ is a line passing through the origin, or it is a plane passing
through the origin.


1.5.27 Let $W_1$ be a line passing through the origin, and $W_2$ a plane passing through
the origin which does not contain the line $W_1$. Show that $\mathbb{R}^3 = W_1 \oplus W_2$.


1.5.28 Let $(l, m, n) \neq (0, 0, 0)$. Show that $W = \{(\lambda l, \lambda m, \lambda n) \mid \lambda \in \mathbb{R}\}$ is a
subspace of $\mathbb{R}^3$, and the quotient space $\mathbb{R}^3/W$ is the set of all lines having the direction
ratios $(l, m, n)$.


1.5.29 Show that $\mathbb{R}$ considered as a vector space over the field $\mathbb{Q}$ of rational numbers
is infinite dimensional.


1.5.30 Show that in the real vector space $C[0, 1]$ of all real valued continuous functions
on $[0, 1]$, the set $\{\sin x, \sin 2x, \ldots, \sin nx\}$ is a linearly independent subset for
all $n \geq 1$. Show also that the set of all Legendre polynomials forms a linearly
independent set. Deduce that $C[0, 1]$ is infinite dimensional.


Chapter 2 
Matrices and Linear Equations 


Matrices play a pivotal role in mathematics, and in turn, in all branches of science, 
social science, and engineering. This chapter is devoted to the interplay between 
matrices and systems of linear equations. 


2.1 Matrices and Their Algebra 


By definition, an $m \times n$ matrix $A$ with entries in a field $F$ is an arrangement

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$

of $m$ rows and $n$ columns of elements of $F$. In short, $A$ is denoted by $[a_{ij}]$, where $a_{ij}$
is the entry in the $i$th row and $j$th column of $A$. The $i$th row

$$(a_{i1}, a_{i2}, \ldots, a_{in})$$

of the matrix $A$ is a vector in $F^n$, called the $i$th row vector of $A$, and it will be denoted
by $R_i(A)$. Thus, the matrix $A$ can also be expressed as a column


© Springer Nature Singapore Pte Ltd. 2017 31 
R. Lal, Algebra 2, Infosys Science Foundation Series in Mathematical Sciences, 
DOI 10.1007/978-98 1 -10-4256-0_2 




$$\begin{bmatrix} R_1(A) \\ R_2(A) \\ \vdots \\ R_m(A) \end{bmatrix}$$

of $m$ rows with entries in $F^n$.
Similarly, if we treat the members of $F^m$ as column vectors, then the $j$th column

$$\begin{bmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{bmatrix}$$

of the matrix $A$ is a column vector in $F^m$, called the $j$th column vector of $A$, and it
will be denoted by $C_j(A)$. As such, the matrix $A$ can also be expressed as a row

$$A = [C_1(A), C_2(A), \ldots, C_n(A)].$$


Thus, 


is a4 x 5 matrix with entries in the field C of complex numbers. 
A matrix $A$ is called a square matrix if the number of rows and the number of columns are the same.
The matrix

$$A = \begin{bmatrix} 2 & 0 & 1 \\ 4 & 1 & 0 \\ 0 & 8 & 1 \end{bmatrix}$$

is a square $3 \times 3$ matrix with entries in the field $\mathbb{R}$ of real numbers.
The set of all $m \times n$ matrices with entries in a field $F$ is denoted by $M_{mn}(F)$. The set
of all square $n \times n$ matrices is denoted by $M_n(F)$. We have a binary operation $+$ on
$M_{mn}(F)$, called the matrix addition, which is defined by

$$[a_{ij}] + [b_{ij}] = [c_{ij}],$$

where $c_{ij} = a_{ij} + b_{ij}$.




For example,

$$\begin{bmatrix} 2 & 0 & 1 \\ 4 & 1 & 0 \\ 0 & 8 & 1 \end{bmatrix} + \begin{bmatrix} 0 & 1 & 2 \\ 3 & 1 & 0 \\ 5 & 8 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 1 & 3 \\ 7 & 2 & 0 \\ 5 & 16 & 2 \end{bmatrix}.$$

The $m \times n$ matrix $0_{m \times n}$ all of whose entries are 0 is called the zero $m \times n$ matrix.
Clearly, the matrix $0_{m \times n}$ is characterized by the property that for any $m \times n$ matrix
$A$, $A + 0_{m \times n} = A = 0_{m \times n} + A$. Further, if $A = [a_{ij}]$ is an $m \times n$ matrix, then
the matrix $-A = [-a_{ij}]$ all of whose entries are the negatives of the corresponding
entries of $A$ is called the negative of $A$, and it is characterized by the property that
$A + (-A) = 0_{m \times n} = (-A) + A$. The proof of the following proposition is an
immediate consequence of the corresponding properties of the addition $+$ in $F$.


Proposition 2.1.1 The set $M_{mn}(F)$ of $m \times n$ matrices with entries in $F$ is an abelian
group with respect to the matrix addition in the sense that it satisfies the following
properties:

(i) The matrix addition $+$ is associative in the sense that

$$(A + B) + C = A + (B + C)$$

for all $A$, $B$ and $C$ in $M_{mn}(F)$.

(ii) The matrix addition $+$ is commutative in the sense that

$$A + B = B + A$$

for all $A$, $B$ in $M_{mn}(F)$.

(iii) There is a unique matrix $0_{m \times n}$ in $M_{mn}(F)$ such that $A + 0_{m \times n} = A = 0_{m \times n} + A$
for all $A$ in $M_{mn}(F)$.

(iv) For each matrix $A$ in $M_{mn}(F)$, there is a unique matrix $-A$ in $M_{mn}(F)$ such that
$A + (-A) = 0_{m \times n} = (-A) + A$. ♯

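We have an external multiplication $\cdot$ on $M_{mn}(F)$ by scalars in $F$ defined by
$a \cdot [a_{ij}] = [b_{ij}]$, where $b_{ij} = a \cdot a_{ij}$. Thus, for example,

$$2 \cdot \begin{bmatrix} 2 & 0 & 1 \\ 4 & 1 & 0 \\ 0 & 8 & 1 \end{bmatrix} = \begin{bmatrix} 4 & 0 & 2 \\ 8 & 2 & 0 \\ 0 & 16 & 2 \end{bmatrix}.$$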

It can be further observed that the triple $(M_{mn}(F), +, \cdot)$ is a vector space over $F$.
Indeed, $(M_{mn}(F), +, \cdot)$ can be identified with the triple $(F^{mn}, +, \cdot)$ under the
correspondence $A \longleftrightarrow (R_1(A), R_2(A), \ldots, R_m(A))$ which respects all the operations.
Let $e_{ij}$ denote the matrix in which the $i$th row $j$th column entry is 1 and the rest of the
entries are 0. For example, the $3 \times 3$ matrix $e_{23}$ is given by

$$e_{23} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}.$$

It follows that the set $\{e_{ij} \mid 1 \leq i \leq m,\ 1 \leq j \leq n\}$ corresponds to the standard basis
of $F^{mn}$ under the above correspondence. Clearly,

$$[a_{ij}] = \sum_{i,j} a_{ij} e_{ij},$$

and

$$\sum_{i,j} a_{ij} e_{ij} = 0_{m \times n}$$

if and only if $a_{ij} = 0$ for all $i, j$. Thus, $\{e_{ij} \mid 1 \leq i \leq m,\ 1 \leq j \leq n\}$ is a basis, called
the standard basis, of the vector space $M_{mn}(F)$. Thus, the dimension of $M_{mn}(F)$ is
$m \cdot n$. In particular, $M_n(F)$ is of dimension $n^2$.

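Apart from the above operations, we have an external operation $\cdot$ from $M_{mn}(F) \times M_{np}(F)$
to $M_{mp}(F)$, called the matrix multiplication, defined as follows: Let
$A = [a_{ij}]$, $1 \leq i \leq m$, $1 \leq j \leq n$, and $B = [b_{jk}]$, $1 \leq j \leq n$, $1 \leq k \leq p$. Then
$A \cdot B = [c_{ik}]$, where $c_{ik} = \sum_j a_{ij} b_{jk}$. Thus, for example,

$$\begin{bmatrix} 2 & 0 & 1 \\ 4 & 1 & 0 \\ 0 & 8 & 1 \end{bmatrix} \cdot \begin{bmatrix} 0 & 1 & 2 \\ 3 & 1 & 0 \\ 5 & 8 & 1 \end{bmatrix} = \begin{bmatrix} 5 & 10 & 5 \\ 3 & 5 & 8 \\ 29 & 16 & 1 \end{bmatrix}.$$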
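
The defining formula $c_{ik} = \sum_j a_{ij} b_{jk}$ translates directly into code. The following sketch (the list-of-rows representation and the name `mat_mul` are our assumptions) recomputes the product above and also checks, on these particular matrices only, that reversing the order of the factors gives a different result while the grouping of three factors does not matter.

```python
def mat_mul(a, b):
    """Product of an m x n matrix a and an n x p matrix b: c_ik = sum_j a_ij * b_jk."""
    n = len(b)
    assert all(len(row) == n for row in a), "inner dimensions must agree"
    p = len(b[0])
    return [[sum(a[i][j] * b[j][k] for j in range(n)) for k in range(p)]
            for i in range(len(a))]


A = [[2, 0, 1], [4, 1, 0], [0, 8, 1]]
B = [[0, 1, 2], [3, 1, 0], [5, 8, 1]]
C = [[1, 0, 0], [2, 1, 0], [0, 3, 1]]   # an arbitrary third matrix for the check

print(mat_mul(A, B))
print(mat_mul(A, B) == mat_mul(B, A))
print(mat_mul(mat_mul(A, B), C) == mat_mul(A, mat_mul(B, C)))
```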


It can be observed easily that the matrix multiplication is distributive over addition
from left as well as from right in the sense that $(A + B) \cdot C = A \cdot C + B \cdot C$
and $A \cdot (B + C) = A \cdot B + A \cdot C$. Evidently, $A \cdot 0_{n \times p} = 0_{m \times p}$, and
$0_{p \times m} \cdot A = 0_{p \times n}$. Again, since $\sum_k \left(\sum_j a_{ij} b_{jk}\right) c_{kl} = \sum_j a_{ij} \left(\sum_k b_{jk} c_{kl}\right)$, it follows that the matrix
multiplication is associative in the sense that $(A \cdot B) \cdot C = A \cdot (B \cdot C)$ whenever the
products are defined. In particular, we have a multiplication $\cdot$ in $M_n(F)$. Note that
matrix multiplication is not commutative, for example,


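$$\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix},$$

whereas

$$\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}.$$

Thus, the set $M_n(F)$ of $n \times n$ matrices with entries in $F$ together with matrix
addition $+$, the multiplication by scalars, and the matrix multiplication $\cdot$ is an algebra
in the sense of the following definition.
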

Definition 2.1.2 A vector space $V$ over a field $F$ together with an internal multiplication
$\cdot$ on $V$ is called an algebra over $F$ if the following conditions hold:


1. The internal multiplication $\cdot$ is associative, i.e., $(x \cdot y) \cdot z = x \cdot (y \cdot z)$ for all
$x, y, z \in V$.

2. $\cdot$ distributes over addition $+$, i.e., $(x + y) \cdot z = x \cdot z + y \cdot z$, and also
$x \cdot (y + z) = x \cdot y + x \cdot z$ for all $x, y, z \in V$.

3. $a(x \cdot y) = (ax) \cdot y = x \cdot (ay)$ for all $a \in F$, and $x, y \in V$.


Let $A$ be an $n \times m$ matrix. The $m \times n$ matrix $A^t$ obtained by interchanging rows and
columns of $A$ is called the transpose of $A$. More precisely, if $A = [a_{ij}]$ is an $n \times m$
matrix, then the $m \times n$ matrix $A^t = [b_{ji}]$, where $b_{ji} = a_{ij}$, is called the transpose of
$A$. Let $A = [a_{ij}]$ be an $n \times m$ matrix with entries in the field $\mathbb{C}$ of complex numbers.
The matrix $\bar{A} = [b_{ij}]$, where $b_{ij} = \bar{a}_{ij}$ (the complex conjugate of $a_{ij}$), is called the
conjugate of the matrix $A$. The matrix $A^* = \bar{A}^t$ is called the tranjugate, also called
the Hermitian conjugate, of $A$.


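Thus, for example,

$$\begin{bmatrix} 2 & 0 & 1 \\ 4 & 1 & 0 \\ 0 & 8 & 1 \end{bmatrix}^t = \begin{bmatrix} 2 & 4 & 0 \\ 0 & 1 & 8 \\ 1 & 0 & 1 \end{bmatrix}.$$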
2+iilt+i|* P—74—7 144 
ALTi 0 =| — <7 § 
LoS 143 lef - TS7 


Proposition 2.1.3 Let $A$, $B$ be matrices with entries in a field $F$. Then


(i) $(A + B)^t = A^t + B^t$.
(ii) $(A^t)^t = A$.
(iii) $(a \cdot A)^t = a \cdot A^t$.
(iv) $(A \cdot B)^t = B^t \cdot A^t$,


provided the relevant sums and the products are defined.
Further, if $A$, $B$ are matrices with entries in the field $\mathbb{C}$ of complex numbers, then


(v) $(A + B)^* = A^* + B^*$.
(vi) $(A^*)^* = A$.
(vii) $(a \cdot A)^* = \bar{a} \cdot A^*$.
(viii) $(A \cdot B)^* = B^* \cdot A^*$,


provided the relevant sums and the products are defined.


Proof The identities (i), (ii), and (iii) are evident from the definition. We prove
(iv). Suppose that $A = [a_{ij}]$ is an $n \times m$ matrix, and $B = [b_{jk}]$ is an $m \times p$ matrix. Then,
by the definition, $A \cdot B = [c_{ik}]$, where $c_{ik} = \sum_j a_{ij} b_{jk} = \sum_j v_{kj} u_{ji}$, where $v_{kj} = b_{jk}$
and $u_{ji} = a_{ij}$. By the definition, $B^t = [v_{kj}]$, $A^t = [u_{ji}]$, and $(A \cdot B)^t = [w_{ki}]$, where
$w_{ki} = c_{ik}$. This shows that the $k$th row $i$th column entries of both sides are the same. This
proves the result. The proofs of the rest of the identities are similar. ♯
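
Identities (iv) and (viii) are easy to test on concrete matrices. The sketch below is illustrative only: matrices are lists of rows, the helper names are ours, and the sample matrices are arbitrary choices with small integer parts so that the complex arithmetic is exact.

```python
def mat_mul(a, b):
    """c_ik = sum_j a_ij * b_jk."""
    return [[sum(a[i][j] * b[j][k] for j in range(len(b))) for k in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Interchange rows and columns."""
    return [list(col) for col in zip(*a)]


def tranjugate(a):
    """Conjugate transpose A* of a complex matrix."""
    return [[complex(x).conjugate() for x in col] for col in zip(*a)]


A = [[2, 0, 1], [4, 1, 0]]            # 2 x 3
B = [[0, 1], [3, 1], [5, 8]]          # 3 x 2
assert transpose(mat_mul(A, B)) == mat_mul(transpose(B), transpose(A))

C = [[2 + 1j, 1j], [1 - 1j, 3]]
D = [[1, 2 - 1j], [1j, 4]]
assert tranjugate(mat_mul(C, D)) == mat_mul(tranjugate(D), tranjugate(C))
print("identities (iv) and (viii) hold for these samples")
```

Note the reversal of the factors on the right-hand side; with the factors in the original order the products need not even be defined when the matrices are not square.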


2.2 Types of Matrices 


1. Identity matrix. The $n \times n$ matrix all of whose diagonal entries are 1 and off
diagonal entries are 0 is called the identity matrix of order $n$, and it is denoted by $I_n$.
For example,

$$I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$

It can be checked that $I_n \cdot A = A = A \cdot I_m$ for every $n \times m$ matrix $A$. Indeed, if $C$
is an $n \times n$ matrix such that $C \cdot A = A$ for every $n \times m$ matrix $A$, then $C = I_n$.
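2. Diagonal matrix. A matrix $A = [a_{ij}]$ is called a diagonal matrix if all off diagonal
entries are 0. Thus, $[a_{ij}]$ is a diagonal matrix if $a_{ij} = 0$ for all $i \neq j$. The diagonal
matrix whose $i$th row $i$th column entry is $\alpha_i$ is denoted by $\mathrm{Diag}(\alpha_1, \alpha_2, \ldots, \alpha_n)$. For
example,

$$\mathrm{Diag}(1, 2, 3) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix}.$$

The effect of multiplying the diagonal matrix $\mathrm{Diag}(\alpha_1, \alpha_2, \ldots, \alpha_n)$ to an $n \times m$ matrix
$A$ from the left is to multiply the $i$th row by $\alpha_i$. Thus $\mathrm{Diag}(\alpha_1, \alpha_2, \ldots, \alpha_n) \cdot [a_{ij}] = [b_{ij}]$,
where $b_{ij} = \alpha_i a_{ij}$. Similarly, the effect of multiplying this matrix to an $m \times n$
matrix $A$ from the right is the same as multiplying the $i$th column by $\alpha_i$. In particular,
$\mathrm{Diag}(\alpha_1, \alpha_2, \ldots, \alpha_n) \cdot \mathrm{Diag}(\beta_1, \beta_2, \ldots, \beta_n) = \mathrm{Diag}(\alpha_1\beta_1, \alpha_2\beta_2, \ldots, \alpha_n\beta_n)$.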

3. Scalar matrix. An $n \times n$ diagonal matrix all of whose diagonal entries are the same
is called a scalar matrix. Thus, a scalar matrix is of the form $\alpha I_n$, and the effect of
multiplying this matrix to a matrix $A$ is $\alpha A$.

4. Symmetric matrix. A matrix $A$ is called a symmetric matrix if $A^t = A$. Thus,
a diagonal matrix is a symmetric matrix. The matrix

$$\begin{bmatrix} 1 & 3 & 2 \\ 3 & 2 & 0 \\ 2 & 0 & 3 \end{bmatrix}$$

is a symmetric matrix. It follows from Proposition 2.1.3 that the sum of two symmetric
matrices is symmetric, and a scalar multiple of a symmetric matrix is a symmetric matrix.
Thus, the set $S_n(F)$ of all $n \times n$ symmetric matrices forms a subspace of $M_n(F)$. For
all matrices $A$, $AA^t$ is a symmetric matrix. For a square matrix $A$, $A + A^t$ is a
symmetric matrix. The product of two symmetric matrices is symmetric if and only if
they commute.

5. Skew symmetric matrix. A matrix $A$ is called a skew symmetric matrix if
$A^t = -A$. For example, the matrix

$$\begin{bmatrix} 0 & 3 & 2 \\ -3 & 0 & 0 \\ -2 & 0 & 0 \end{bmatrix}$$

is a skew symmetric matrix. It follows from Proposition 2.1.3 that the sum of two
skew symmetric matrices is skew symmetric, and a scalar multiple of a skew symmetric
matrix is a skew symmetric matrix. Thus, the set $SS_n(F)$ of all $n \times n$ skew symmetric
matrices forms a subspace of $M_n(F)$. $A - A^t$ is skew symmetric for all square
matrices $A$. The product of two skew symmetric matrices is skew symmetric if and only
if they anticommute in the sense that $A \cdot B = -B \cdot A$. Also observe that the diagonal
entries of a skew symmetric matrix are 0 (when the characteristic of $F$ is not 2).

Every square matrix $A$ with entries in a field $F$ of characteristic different from 2 can be
uniquely represented as a sum

$$A = \frac{A + A^t}{2} + \frac{A - A^t}{2}$$

of a symmetric matrix $\frac{A + A^t}{2}$ and a skew symmetric matrix $\frac{A - A^t}{2}$
(prove the uniqueness of the representation).
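
The decomposition into a symmetric and a skew symmetric part can be computed exactly with rational arithmetic. This is a small sketch with hypothetical helper names; it assumes entries in $\mathbb{Q}$ (where dividing by 2 is allowed) and uses the $3 \times 3$ matrix of the earlier examples.

```python
from fractions import Fraction


def transpose(a):
    return [list(col) for col in zip(*a)]


def symmetric_skew_parts(a):
    """Return ((A + A^t)/2, (A - A^t)/2) for a square matrix A with rational entries."""
    at = transpose(a)
    n = len(a)
    sym = [[Fraction(a[i][j] + at[i][j], 2) for j in range(n)] for i in range(n)]
    skew = [[Fraction(a[i][j] - at[i][j], 2) for j in range(n)] for i in range(n)]
    return sym, skew


A = [[2, 0, 1], [4, 1, 0], [0, 8, 1]]
S, K = symmetric_skew_parts(A)

assert transpose(S) == S                                   # S is symmetric
assert transpose(K) == [[-x for x in row] for row in K]    # K is skew symmetric
assert [[s + k for s, k in zip(rs, rk)] for rs, rk in zip(S, K)] == A
print(S)
print(K)
```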

6. Hermitian matrix. A matrix $A$ with entries in the field $\mathbb{C}$ of complex numbers is
called a Hermitian matrix (also termed self adjoint) if $A^* = A$. Thus, a matrix
$A$ with real entries is Hermitian if and only if it is symmetric. The matrix

$$\begin{bmatrix} 1 & 3 + i & 2 \\ 3 - i & 2 & i \\ 2 & -i & 3 \end{bmatrix}$$

is a Hermitian matrix. Evidently, all diagonal entries of Hermitian matrices are real. It
follows from Proposition 2.1.3 that the sum of two Hermitian matrices is Hermitian.
However, only real scalar multiples of a Hermitian matrix are Hermitian matrices. For
all matrices $A$, $AA^*$ is a Hermitian matrix. For a square matrix $A$, $A + A^*$ is also a
Hermitian matrix. The product of two Hermitian matrices is Hermitian if and only if they
commute.

7. Skew-Hermitian matrix. A matrix $A$ with entries in the field $\mathbb{C}$ of complex
numbers is called a skew-Hermitian matrix if $A^* = -A$. Thus, a matrix $A$ with
real entries is skew-Hermitian if and only if it is skew symmetric. The matrix

$$\begin{bmatrix} i & 3i - 1 & 2 \\ 3i + 1 & 2i & -1 \\ -2 & 1 & 3i \end{bmatrix}$$

is a skew-Hermitian matrix. Evidently, all diagonal entries of skew-Hermitian matrices
are purely imaginary. It follows from Proposition 2.1.3 that the sum of two
skew-Hermitian matrices is skew-Hermitian. However, only real scalar multiples
of a skew-Hermitian matrix are skew-Hermitian matrices. Observe that a matrix $A$ is
skew-Hermitian if and only if $iA$ is a Hermitian matrix. For all matrices $A$, $iAA^*$ is
a skew-Hermitian matrix. For a square matrix $A$, $A - A^*$ is also a skew-Hermitian
matrix. The product of two skew-Hermitian matrices is skew-Hermitian if and only if
they anticommute in the sense that $AB = -BA$.

Every square matrix $A$ with entries in the field $\mathbb{C}$ of complex numbers can be
uniquely represented as a sum

$$A = \frac{A + A^*}{2} + \frac{A - A^*}{2}$$

of a Hermitian matrix $\frac{A + A^*}{2}$, and a skew-Hermitian matrix $\frac{A - A^*}{2}$
(prove the uniqueness of the representation). In turn, it
follows that every square matrix $A$ with entries in the field $\mathbb{C}$ of complex numbers can
be uniquely represented as $A = B + iC$, where $B$ and $C$ are Hermitian matrices.
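
The representation $A = B + iC$ can be made explicit: $B = \frac{A + A^*}{2}$ and $C = \frac{A - A^*}{2i}$. The following sketch computes both parts for a sample matrix; the helper names and the sample matrix are our own choices, and the entries are picked so that Python's floating-point complex arithmetic is exact here.

```python
def tranjugate(a):
    """Conjugate transpose A* of a complex matrix."""
    return [[complex(x).conjugate() for x in col] for col in zip(*a)]


def hermitian_parts(a):
    """Return Hermitian matrices B, C with A = B + iC."""
    a_star = tranjugate(a)
    n = len(a)
    b = [[(a[i][j] + a_star[i][j]) / 2 for j in range(n)] for i in range(n)]
    # Dividing by 2i is the same as multiplying by -i/2.
    c = [[(a[i][j] - a_star[i][j]) * -0.5j for j in range(n)] for i in range(n)]
    return b, c


A = [[2 + 1j, 1j], [4, 3 - 2j]]
B, C = hermitian_parts(A)

assert tranjugate(B) == B and tranjugate(C) == C
assert all(B[i][j] + 1j * C[i][j] == A[i][j] for i in range(2) for j in range(2))
print(B)
print(C)
```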
8. Nonsingular matrices. An $n \times n$ matrix $A$ is called a nonsingular matrix (also
called an invertible matrix) if there is an $n \times n$ matrix $B$ such that $A \cdot B = I_n = B \cdot A$.
Note that such a $B$, if it exists, will be unique, for if $B_1$ and $B_2$ are such matrices, then
$B_1 = B_1 \cdot I_n = B_1 \cdot (A \cdot B_2) = (B_1 \cdot A) \cdot B_2 = I_n \cdot B_2 = B_2$. If $A$ is an invertible
matrix, then the unique $B$ such that $A \cdot B = I_n = B \cdot A$ is called the inverse of $A$,
and it is denoted by $A^{-1}$. Following are some simple observations:

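(i) The identity matrix $I_n$ is invertible and $I_n^{-1} = I_n$.

(ii) Consider a diagonal matrix $\mathrm{diag}(\alpha_1, \alpha_2, \ldots, \alpha_n)$. As already observed in 2, $\mathrm{diag}(\alpha_1, \alpha_2, \ldots, \alpha_n) \cdot [a_{ij}] = [b_{ij}]$, where $b_{ij} = \alpha_i a_{ij}$. Thus, $\mathrm{diag}(\alpha_1, \alpha_2, \ldots, \alpha_n) \cdot [a_{ij}] = I_n$ if and only if $\alpha_i a_{ij} = 1$ for $j = i$, and $0$ otherwise. This is so if and only if $\alpha_i \neq 0$, $a_{ii} = \alpha_i^{-1}$ for each $i$, and $a_{ij} = 0$ for all $i \neq j$. This shows that $\mathrm{diag}(\alpha_1, \alpha_2, \ldots, \alpha_n)$ is invertible if and only if each $\alpha_i \neq 0$, and then its inverse is $\mathrm{diag}(\alpha_1^{-1}, \alpha_2^{-1}, \ldots, \alpha_n^{-1})$.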


(iii) Let $A$ and $B$ be invertible $n \times n$ matrices. Then $(AB)(B^{-1}A^{-1}) = I_n = (B^{-1}A^{-1})(AB)$. This shows that $AB$ is also invertible and $(AB)^{-1} = B^{-1}A^{-1}$.

In due course, we shall describe an algorithm to check if a matrix is invertible, and then to find its inverse.
9. Triangular matrices. A square matrix $A$ is said to be an upper (lower) triangular matrix if all its entries below (above) the diagonal are 0. More precisely, an $n \times n$ matrix $A = [a_{ij}]$ is called an upper (lower) triangular matrix if $a_{ij} = 0$ for all $i > j$ ($i < j$). It is called strictly upper (lower) triangular if, in addition, all the diagonal entries are also 0. For example,
$$\begin{bmatrix} 1 & 4 & 6 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix}$$
is an upper triangular matrix.

Clearly, the sum of any two upper (lower) triangular matrices is an upper (lower) triangular matrix. Also, a scalar multiple of an upper (lower) triangular matrix is an upper (lower) triangular matrix. Thus, the set $T_+(n, F)$ ($T_-(n, F)$) of upper (lower) triangular matrices forms a subspace of $M_n(F)$.

Further, $T_+(n, F)$ ($T_-(n, F)$) is closed under matrix multiplication. For, let $A = [a_{ij}]$ and $B = [b_{jk}]$ be upper triangular matrices. Then $a_{ij} = 0$ for all $i > j$ and $b_{jk} = 0$ for all $j > k$. Let $A \cdot B = [c_{ik}]$, and suppose that $i > k$. For each $j$, either $j < i$, so that $a_{ij} = 0$, or $j \geq i > k$, so that $b_{jk} = 0$. Hence $c_{ik} = \sum_j a_{ij} b_{jk} = 0$ for all $i > k$.

Next, let $A = [a_{ij}] \in T_+(n, F)$ be a nonsingular matrix. Then there is a matrix $B = [b_{ij}]$ such that $B \cdot A = I_n$. Equating the first row first column entries on both sides, we get $b_{11}a_{11} = 1$. But then $a_{11} \neq 0$ and $b_{11} = a_{11}^{-1}$. Equating the second row first column entries, we obtain $b_{21}a_{11} = 0$. Hence $b_{21} = 0$. Similarly, equating the $i$th row first column entries, we obtain $b_{i1}a_{11} = 0$, and so $b_{i1} = 0$ for all $i > 1$. Equating the 1st row 2nd column entries, we get $b_{11}a_{12} + b_{12}a_{22} = 0$, and equating the 2nd row 2nd column entries, we get $b_{22}a_{22} = 1$. Thus $a_{22} \neq 0$, $b_{22} = a_{22}^{-1}$, and $b_{12} = -a_{11}^{-1}a_{12}a_{22}^{-1}$. Proceeding in this way, we obtain that all the diagonal entries $a_{ii}$ of $A$ are nonzero, and then we can solve for the $b_{ij}$ to get the inverse of $A$. We also observe that the inverse of $A$ is also a member of $T_+(n, F)$. For example, consider the upper triangular matrix


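$$\begin{bmatrix} 2 & 4 & 6 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix}$$

all of whose diagonal entries are nonzero. We find its inverse. Suppose that

$$\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} \cdot \begin{bmatrix} 2 & 4 & 6 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$

Then we have the following equations:

$$\begin{aligned} 2a_{11} &= 1, & 4a_{11} + 2a_{12} &= 0, & 6a_{11} + 3a_{13} &= 0, \\ 2a_{21} &= 0, & 4a_{21} + 2a_{22} &= 1, & 6a_{21} + 3a_{23} &= 0, \\ 2a_{31} &= 0, & 4a_{31} + 2a_{32} &= 0, & 6a_{31} + 3a_{33} &= 1. \end{aligned}$$

Solving, we get that $a_{11} = \frac{1}{2}$, $a_{12} = -1 = a_{13}$, $a_{21} = a_{31} = a_{32} = 0$, $a_{23} = 0$, $a_{22} = \frac{1}{2}$, and $a_{33} = \frac{1}{3}$. Thus, the inverse of the said matrix is

$$\begin{bmatrix} \frac{1}{2} & -1 & -1 \\ 0 & \frac{1}{2} & 0 \\ 0 & 0 & \frac{1}{3} \end{bmatrix}.$$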
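The row-by-row procedure just used for an upper triangular matrix can be sketched in code. The following is a minimal illustration, assuming exact rational arithmetic via Python's `fractions` module; the function name `upper_triangular_inverse` is hypothetical and not part of the text.

```python
from fractions import Fraction

def upper_triangular_inverse(a):
    """Return B with B*a = I for an upper triangular a with nonzero diagonal.

    The entries of B are found row by row, exactly as in the text:
    equating the (i, j) entry of B*a with the (i, j) entry of I.
    """
    n = len(a)
    if any(a[i][i] == 0 for i in range(n)):
        raise ValueError("a zero diagonal entry makes the matrix singular")
    b = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # (B*a)[i][j] = sum_{k <= j} b[i][k] * a[k][j]; b[i][k] = 0 for k < i.
            target = Fraction(1 if i == j else 0)
            partial = sum(b[i][k] * a[k][j] for k in range(i, j))
            b[i][j] = (target - partial) / a[j][j]
    return b

inverse = upper_triangular_inverse([[2, 4, 6], [0, 2, 0], [0, 0, 3]])
```

For the matrix of the example, `inverse` has rows (1/2, -1, -1), (0, 1/2, 0), (0, 0, 1/3), in agreement with the computation above.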


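Block multiplication

We can multiply two matrices by using suitable blocks of their submatrices. More explicitly, let $A$ be an $m \times n$ matrix, and $B$ an $n \times p$ matrix. Suppose that $m = m_1 + m_2 + \cdots + m_r$, $n = n_1 + n_2 + \cdots + n_s$, and $p = p_1 + p_2 + \cdots + p_t$, where $m_i$, $n_j$ and $p_k$ are positive integers. Then $A$ and $B$ can be expressed uniquely as

$$A = \begin{bmatrix} A_{11} & A_{12} & \cdots & A_{1s} \\ A_{21} & A_{22} & \cdots & A_{2s} \\ \vdots & \vdots & & \vdots \\ A_{r1} & A_{r2} & \cdots & A_{rs} \end{bmatrix},$$

where $A_{ij}$ is an $m_i \times n_j$ matrix, and

$$B = \begin{bmatrix} B_{11} & B_{12} & \cdots & B_{1t} \\ B_{21} & B_{22} & \cdots & B_{2t} \\ \vdots & \vdots & & \vdots \\ B_{s1} & B_{s2} & \cdots & B_{st} \end{bmatrix},$$

where $B_{jk}$ is an $n_j \times p_k$ matrix. Further, then

$$A \cdot B = \begin{bmatrix} C_{11} & C_{12} & \cdots & C_{1t} \\ C_{21} & C_{22} & \cdots & C_{2t} \\ \vdots & \vdots & & \vdots \\ C_{r1} & C_{r2} & \cdots & C_{rt} \end{bmatrix},$$

where $C_{ik} = \sum_{j=1}^{s} A_{ij}B_{jk}$.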
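The block formula can be checked on a small case. The sketch below is only an illustration: the helper names (`split`, `assemble`, `block_product`) and the particular matrices and partitions are assumptions chosen for the demonstration.

```python
def matmul(a, b):
    """Ordinary matrix product of lists of rows."""
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))] for i in range(len(a))]

def matadd(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def split(mat, row_sizes, col_sizes):
    """Cut mat into blocks; blocks[i][j] has row_sizes[i] rows, col_sizes[j] columns."""
    blocks, r0 = [], 0
    for rs in row_sizes:
        row_blocks, c0 = [], 0
        for cs in col_sizes:
            row_blocks.append([row[c0:c0 + cs] for row in mat[r0:r0 + rs]])
            c0 += cs
        blocks.append(row_blocks)
        r0 += rs
    return blocks

def assemble(blocks):
    """Glue a rectangular array of blocks back into one matrix."""
    rows = []
    for row_blocks in blocks:
        for i in range(len(row_blocks[0])):
            rows.append([x for blk in row_blocks for x in blk[i]])
    return rows

def block_product(a_blocks, b_blocks):
    """C_ik = sum over j of A_ij * B_jk, computed block by block."""
    s, t = len(b_blocks), len(b_blocks[0])
    result = []
    for a_row in a_blocks:
        c_row = []
        for k in range(t):
            acc = matmul(a_row[0], b_blocks[0][k])
            for j in range(1, s):
                acc = matadd(acc, matmul(a_row[j], b_blocks[j][k]))
            c_row.append(acc)
        result.append(c_row)
    return result

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]      # m = 3 = 1 + 2, n = 3 = 2 + 1
B = [[1, 0], [2, 1], [0, 3]]               # p = 2 = 1 + 1
blockwise = assemble(block_product(split(A, [1, 2], [2, 1]),
                                   split(B, [2, 1], [1, 1])))
```

Here `blockwise` coincides with the ordinary product `matmul(A, B)`. Note that the column partition of $A$ must be the same as the row partition of $B$ for the block products $A_{ij}B_{jk}$ to be defined.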


2.3 System of Linear Equations 


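A system of $m$ linear equations in $n$ unknowns $x_1, x_2, \ldots, x_n$ over a field $F$ is given by

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\ &\ \ \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m \end{aligned} \qquad (2.1)$$

where $a_{ij}, b_i \in F$.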


Example 2.3.1 The following is a system of two linear equations in three unknowns over the field of real numbers:

$$\begin{aligned} 3x_1 + 2x_2 + x_3 &= 1, \\ x_1 + x_2 + x_3 &= 2. \end{aligned}$$

We say that an $n$-tuple $(a_1, a_2, \ldots, a_n)$ in $F^n$ is a solution of the system (2.1) of linear equations if $x_1 = a_1, x_2 = a_2, \ldots, x_n = a_n$ satisfies all the equations in the system (2.1). Thus, $(-2, 3, 1)$ is a solution of the system of linear equations in the above example. $(-3, 5, 0)$ is also a solution of the above system. Indeed, there are infinitely many solutions, which can be parametrized in terms of $x_3$ as $(x_3 - 3, 5 - 2x_3, x_3)$. Clearly, this represents a line.


Example 2.3.2 The system

$$\begin{aligned} x_1 + 2x_2 + 3x_3 &= 1, \\ x_1 + x_2 + 3x_3 &= 2, \\ 4x_1 + 6x_2 + 12x_3 &= 5 \end{aligned}$$

of linear equations has no solution (why?).

Example 2.3.3 The system

$$\begin{aligned} x_1 + 2x_2 &= 1, \\ 2x_1 + 2x_2 &= a \end{aligned}$$

has a unique solution for all $a$ (why?).


Definition 2.3.4 A system of linear equations is said to be consistent if it has a 
solution. It is said to be inconsistent otherwise. 


The system in Example 2.3.1 is consistent with infinitely many solutions, the system in Example 2.3.2 is inconsistent, whereas the system in Example 2.3.3 is consistent with a unique solution.

Many problems in engineering, industry, the social sciences, and medical science can be modeled in terms of systems of linear equations. As such, describing and interpreting the solutions of a system of linear equations is one of the main themes of linear algebra. In the following few sections we shall concentrate on this.

The system (2.1) of $m$ linear equations in $n$ unknowns can be expressed as a single matrix equation

$$A\bar{x}^t = \bar{b}^t, \qquad (2.2)$$

where $A = [a_{ij}]$ is the $m \times n$ matrix whose $i$th row $j$th column entry is $a_{ij}$, $\bar{x} = [x_1, x_2, \ldots, x_n] \in F^n$ is the $1 \times n$ row matrix of unknowns, and $\bar{b} = [b_1, b_2, \ldots, b_m] \in F^m$ is a $1 \times m$ row matrix.

Thus, the system of linear equations in Example 2.3.1 can be expressed as

$$\begin{bmatrix} 3 & 2 & 1 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}.$$

The matrix $A$ in (2.2) is called the coefficient matrix of the system (2.1) of linear equations, and the $m \times (n+1)$ matrix $A^+ = [A \ \bar{b}^t]$, whose first $n$ columns are those of $A$ and whose last, $(n+1)$th, column is $\bar{b}^t$, is called the augmented matrix of the system of linear equations.
Thus, the coefficient matrix of the system in Example 2.3.2 is

$$\begin{bmatrix} 1 & 2 & 3 \\ 1 & 1 & 3 \\ 4 & 6 & 12 \end{bmatrix},$$

and the augmented matrix of the example is

$$\begin{bmatrix} 1 & 2 & 3 & 1 \\ 1 & 1 & 3 & 2 \\ 4 & 6 & 12 & 5 \end{bmatrix}.$$


Definition 2.3.5 A system of linear equations given by the matrix equation

$$A\bar{x}^t = \bar{0}^t \qquad (2.3)$$

is called a homogeneous system of linear equations. It is also called the homogeneous part of the system of linear equations given by

$$A\bar{x}^t = \bar{b}^t.$$


Proposition 2.3.6 A homogeneous system of linear equations given by the matrix equation

$$A\bar{x}^t = \bar{0}^t$$

is always consistent, and the set of solutions of the homogeneous system is a subspace of $F^n$.

Proof Let $N(A)$ denote the set of all solutions of $A\bar{x}^t = \bar{0}^t$. Since $A\bar{0}^t = \bar{0}^t$, it follows that $\bar{0} \in N(A)$. Let $\bar{u}, \bar{v} \in N(A)$, and $a, b \in F$. Then $A(a\bar{u} + b\bar{v})^t = aA\bar{u}^t + bA\bar{v}^t = \bar{0}^t$. This shows that $a\bar{u} + b\bar{v} \in N(A)$. It follows that $N(A)$ is a subspace of $F^n$. ♯


Definition 2.3.7 The subspace $N(A)$ described in the above proposition is called the solution space of the system (2.3) of linear equations, and it is also called the null space of the matrix $A$. The dimension of the null space $N(A)$ is called the nullity of $A$, and it is denoted by $n(A)$. If $\{\bar{u}_1, \bar{u}_2, \ldots, \bar{u}_{n(A)}\}$ is a basis of $N(A)$, then any solution of (2.3) is uniquely expressed as $c_1\bar{u}_1 + c_2\bar{u}_2 + \cdots + c_{n(A)}\bar{u}_{n(A)}$, where $c_1, c_2, \ldots, c_{n(A)}$ are constants in $F$. As such, $c_1\bar{u}_1 + c_2\bar{u}_2 + \cdots + c_{n(A)}\bar{u}_{n(A)}$ is called a general solution of the homogeneous system (2.3).

A little later, we shall give an algorithm to find $n(A)$, indeed a basis of $N(A)$, and so also a general solution of the system (2.3) of linear equations.


Proposition 2.3.8 Suppose that the system of linear equations given by the matrix equation

$$A\bar{x}^t = \bar{b}^t$$

is consistent, and $\bar{a} = [a_1, a_2, \ldots, a_n]$ is a solution of the above equation. Then the coset $\bar{a} + N(A) = \{\bar{a} + \bar{u} \mid \bar{u} \in N(A)\}$ is the complete set of all solutions of the system of linear equations. In turn, if $\{\bar{u}_1, \bar{u}_2, \ldots, \bar{u}_{n(A)}\}$ is a basis of $N(A)$, then $\bar{a} + c_1\bar{u}_1 + c_2\bar{u}_2 + \cdots + c_{n(A)}\bar{u}_{n(A)}$ is a general solution of the system of linear equations, where $c_1, c_2, \ldots, c_{n(A)}$ are arbitrary constants.

Proof Since $\bar{a}$ is a solution of $A\bar{x}^t = \bar{b}^t$, $A\bar{a}^t = \bar{b}^t$. If $\bar{u} \in N(A)$, then $A\bar{u}^t = \bar{0}^t$. But then $A(\bar{a} + \bar{u})^t = A\bar{a}^t + A\bar{u}^t = \bar{b}^t + \bar{0}^t = \bar{b}^t$. This shows that $\bar{a} + \bar{u}$ is also a solution of $A\bar{x}^t = \bar{b}^t$. Conversely, let $\bar{c}$ be a solution of $A\bar{x}^t = \bar{b}^t$. Then $A\bar{c}^t = \bar{b}^t$. Hence $A(\bar{c} - \bar{a})^t = A\bar{c}^t - A\bar{a}^t = \bar{0}^t$. It follows that $\bar{c} - \bar{a} \in N(A)$. This shows that $\bar{c} \in \bar{a} + N(A)$. The rest is an immediate observation. ♯
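As a quick numerical illustration of the coset description, the following sketch checks it on the system of Example 2.3.1, whose solutions were parametrized as $(x_3 - 3, 5 - 2x_3, x_3) = (-3, 5, 0) + x_3(1, -2, 1)$. The helper name `apply` and the sampled range of parameters are assumptions made for the demonstration.

```python
A = [[3, 2, 1], [1, 1, 1]]      # coefficient matrix of Example 2.3.1
b = [1, 2]

def apply(matrix, x):
    """Compute the product of matrix with the column given by the tuple x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in matrix]

particular = (-3, 5, 0)         # one solution of A x^t = b^t
null_vector = (1, -2, 1)        # spans the null space N(A)

# Every member of the coset particular + N(A) should solve the system.
coset_samples = [tuple(p + c * u for p, u in zip(particular, null_vector))
                 for c in range(-3, 4)]
all_solve = all(apply(A, x) == b for x in coset_samples)
```

The value `c = 1` recovers the solution $(-2, 3, 1)$ mentioned in Example 2.3.1.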


Definition 2.3.9 The subspace $R(A)$ of $F^n$ generated by the set $\{R_1(A), R_2(A), \ldots, R_m(A)\}$ of the rows of $A$ is called the row space of $A$, and the dimension of $R(A)$ is called the row rank of $A$. Thus, the maximum number of linearly independent rows of $A$ is the row rank of $A$. Similarly, the subspace $C(A)$ of $F^m$ (the elements of $F^m$ treated as columns) generated by the set of columns of $A$ is called the column space of $A$, and the dimension of $C(A)$ is called the column rank of $A$. Again, it follows that the maximum number of linearly independent columns of $A$ is the column rank of $A$. We shall see, in due course, that the row rank is the same as the column rank, and it is called the rank of $A$. The rank of $A$ is denoted by $r(A)$.


Proposition 2.3.10 The system of linear equations given by the matrix equation

$$A\bar{x}^t = \bar{b}^t$$

is consistent if and only if the column rank of $A$ is the same as that of the augmented matrix $A^+$.

Proof The system of linear equations given by the matrix equation $A\bar{x}^t = \bar{b}^t$ is also expressible as

$$x_1C_1(A) + x_2C_2(A) + \cdots + x_nC_n(A) = \bar{b}^t,$$

where $\bar{x} = [x_1, x_2, \ldots, x_n]$, and $C_i(A)$ denotes the $i$th column of $A$. Thus, the equation has a solution if and only if $\bar{b}^t$ is a linear combination of the columns of $A$. This is equivalent to saying that the column space $C(A)$ of $A$ is the same as the column space $C(A^+)$ of the augmented matrix $A^+$. Since $C(A) \subseteq C(A^+)$, this is equivalent to the fact that the column rank of $A$ is the same as that of $A^+$. ♯

We shall look at an algorithm to find the rank of a matrix, and also an algorithm to find a general solution of $A\bar{x}^t = \bar{b}^t$ provided it is consistent.


2.4 Gauss Elimination, Elementary Operations, Rank, 
and Nullity 


Definition 2.4.1 Two systems of $m$ linear equations in $n$ unknowns are said to be equivalent if they have the same set of solutions.

Example 2.4.2 The system

$$\begin{aligned} x_1 + 2x_2 &= 1 \\ 2x_1 + 2x_2 &= a \end{aligned}$$

of two linear equations in two unknowns is equivalent to the system

$$\begin{aligned} x_1 + 2x_2 &= 1 \\ 3x_1 + 4x_2 &= a + 1, \end{aligned}$$

for they have the same set of solutions, whereas the system is not equivalent to

$$\begin{aligned} x_1 + 2x_2 &= 1 \\ 2x_1 + 3x_2 &= a. \end{aligned}$$


In what follows, we shall introduce an algorithm called the Gaussian elimination 
to reduce a system of linear equations into an equivalent system of linear equations 
from which the solution will become apparent. 


Definition 2.4.3 The following operations on a system of linear equations are called the elementary operations on the system of linear equations, and the corresponding operations on the coefficient and augmented matrices are called the elementary row operations on the matrices:


1. Interchange any two equations in the system. 

2. Multiply an equation in the system by a nonzero member of the field. 

3. Add a nonzero multiple of an equation in the system to another equation in the 
system. 


In turn, the corresponding elementary row operations on matrices are: 


1. Interchange any two rows of the matrix.
2. Multiply a row of the matrix by a nonzero element of the field. 
3. Add a nonzero multiple of a row of the matrix to another row. 
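The three elementary row operations can be sketched as small functions acting on an augmented matrix stored as a list of rows. This is only an illustration: the function names, the 0-based row indices, and the choice $a = 4$ in the system of Example 2.4.2 are assumptions made for the demonstration.

```python
def swap_rows(mat, i, j):
    """Interchange rows i and j (0-based), returning a new matrix."""
    out = [row[:] for row in mat]
    out[i], out[j] = out[j], out[i]
    return out

def scale_row(mat, i, c):
    """Multiply row i by a nonzero scalar c."""
    if c == 0:
        raise ValueError("the scalar must be nonzero")
    out = [row[:] for row in mat]
    out[i] = [c * x for x in out[i]]
    return out

def add_multiple(mat, target, source, c):
    """Add c times row `source` to row `target` (target != source)."""
    out = [row[:] for row in mat]
    out[target] = [x + c * y for x, y in zip(out[target], out[source])]
    return out

def satisfies(augmented, x):
    """Check whether x solves the system with the given augmented matrix."""
    return all(sum(a * xi for a, xi in zip(row[:-1], x)) == row[-1]
               for row in augmented)

system = [[1, 2, 1], [2, 2, 4]]   # Example 2.4.2 with a = 4
solution = (3, -1)                # its unique solution

after_add = add_multiple(system, 1, 0, 1)   # rows: x1 + 2x2 = 1, 3x1 + 4x2 = 5
after_swap = swap_rows(system, 0, 1)
after_scale = scale_row(system, 0, 5)
```

Each transformed system is still satisfied by `solution`, whereas the non-equivalent system with second equation $2x_1 + 3x_2 = 4$ is not.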


The following proposition is an immediate observation. 


Proposition 2.4.4 Any two systems of linear equations which differ by a finite sequence of elementary operations are equivalent. ♯

We shall first discuss an algorithm to find the space of solutions of a homogeneous system of linear equations given by the matrix equation $A\bar{x}^t = \bar{0}^t$. More precisely, we derive an algorithm to find a basis of the null space $N(A)$ of $A$, so that every solution of the system is a unique linear combination of the basis members.


Proposition 2.4.5 The null space $N(A)$, and so also the nullity $n(A)$ of a matrix $A$, remain invariant under the elementary row operations.

Proof Follows from Proposition 2.4.4. ♯


Proposition 2.4.6 The row space $R(A)$, and so also the row rank of a matrix $A$, remain invariant under the elementary row operations.

Proof Interchanging any two rows of a matrix does not change the row space, as the set of rows does not change. Since the subspace of $F^n$ generated by the set $\{R_1(A), R_2(A), \ldots, R_m(A)\}$ of rows of $A$ is the same as the subspace of $F^n$ generated by $\{R_1(A), R_2(A), \ldots, aR_j(A), \ldots, R_m(A)\}$ for each nonzero $a \in F$ and $j \leq m$, it follows that the row space of a matrix remains the same if we multiply a row of the matrix by a nonzero member of the field. Finally, since the subspace of $F^n$ generated by the set $\{R_1(A), R_2(A), \ldots, R_m(A)\}$ of rows of $A$ is the same as the subspace of $F^n$ generated by $\{R_1(A), R_2(A), \ldots, R_k(A) + aR_j(A), \ldots, R_m(A)\}$ for each nonzero $a \in F$ and $j \neq k$, it follows that the row space of a matrix remains the same if we add a nonzero multiple of a row to another row. ♯


The column space of a matrix, in general, is not invariant under elementary row 
operations. However, 


Proposition 2.4.7 The column rank of a matrix remains invariant under elementary 
row operations. 


Proof Let $A$ be a matrix, and $A'$ a matrix obtained by applying any of the elementary row operations on $A$. Then evidently,

$$x_1C_{i_1}(A) + x_2C_{i_2}(A) + \cdots + x_rC_{i_r}(A) = 0$$

if and only if

$$x_1C_{i_1}(A') + x_2C_{i_2}(A') + \cdots + x_rC_{i_r}(A') = 0.$$

This means that the maximum number of linearly independent columns of $A$ is the same as that of $A'$. Thus, the column rank of $A$ is the same as that of $A'$. ♯


We shall describe an algorithm to transform a matrix, by using elementary row operations, into a special form called a reduced row echelon form of the matrix, from which a basis of the null space of the matrix, and also a basis of the row space of the matrix, can be easily obtained.

Definition 2.4.8 An $m \times n$ matrix $A = [a_{ij}]$ is said to be a matrix in reduced row (column) echelon form, or it is said to be a reduced row (column) echelon matrix, if the following hold:

(i) The first nonzero entry in each nonzero row (column) is 1. This entry is called a pivot entry, and the corresponding columns (rows) are called pivot columns (rows) of the matrix. The columns (rows) which are not pivot columns (rows) are called the free columns (rows). The unknown variables corresponding to pivot columns are called pivot variables, and those corresponding to free columns are called free variables.

(ii) The pivot entry in any row (column) is towards the right (bottom) of the pivot entry in the previous row (column).

(iii) All of the rest of the entries in a pivot column (row) are 0.

(iv) All the zero rows (columns) are towards the bottom (right).


Example 2.4.9 The matrix

$$\begin{bmatrix} 1 & 2 & 0 & 0 & 2 \\ 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

is in reduced row echelon form. The 1st row 1st column, the 2nd row 3rd column, and the 3rd row 4th column entries are pivot entries. The 2nd and 5th columns are free columns. $x_1$, $x_3$ and $x_4$ are pivot variables. $x_2$ and $x_5$ are free variables.


Proposition 2.4.10 Let $A$ be an $m \times n$ matrix with entries in a field $F$ which is in reduced row echelon form. Suppose that the columns $C_{i_1}(A), C_{i_2}(A), \ldots, C_{i_r}(A)$ with $i_1 < i_2 < \cdots < i_r$ are pivot columns and the columns $C_{j_1}(A), C_{j_2}(A), \ldots, C_{j_s}(A)$ with $j_1 < j_2 < \cdots < j_s$ are free columns. Then,

(i) the first $r$ rows $R_1(A), R_2(A), \ldots, R_r(A)$ are nonzero rows, and they form a basis of the row space $R(A)$ of $A$,
(ii) the number of pivots is the row rank of $A$,
(iii) the pivot columns form a basis of the column space of $A$,
(iv) the row rank of $A$ is the same as the column rank of $A$. Indeed, it is the number of pivots.

Proof (i) Since each nonzero row contains a unique pivot entry, and the zero rows are towards the bottom, it follows that $R_1(A), R_2(A), \ldots, R_r(A)$ are precisely the nonzero rows of the matrix. Since the pivot entries 1 in $R_1(A), R_2(A), \ldots, R_r(A)$ appear in different columns $i_1, i_2, \ldots, i_r$, and all other entries in these columns are 0, it follows that the set $\{R_1(A), R_2(A), \ldots, R_r(A)\}$ of nonzero rows of $A$ is linearly independent. As such, it forms a basis of the row space $R(A)$ of $A$.

(ii) Follows from (i).

(iii) Clearly, the set $\{C_{i_1}(A), C_{i_2}(A), \ldots, C_{i_r}(A)\}$ of pivot columns forms a linearly independent set, for the $k$th row entry in the pivot column $C_{i_k}(A)$ is 1 and the rest of the entries in this column are 0. It is also evident that all the free columns are linear combinations of the pivot columns. Indeed,

$$C_{j_l}(A) = a_{1j_l}C_{i_1}(A) + a_{2j_l}C_{i_2}(A) + \cdots + a_{rj_l}C_{i_r}(A).$$

(iv) Follows from (ii) and (iii). ♯


Proposition 2.4.11 Consider the homogeneous system of linear equations given by the matrix equation

$$A\bar{x}^t = \bar{0}^t,$$

where $A$ is a reduced row echelon $m \times n$ matrix with entries in a field $F$. Suppose that the columns $C_{i_1}(A), C_{i_2}(A), \ldots, C_{i_r}(A)$ with $i_1 < i_2 < \cdots < i_r$ are pivot columns, and the columns $C_{j_1}(A), C_{j_2}(A), \ldots, C_{j_s}(A)$ with $j_1 < j_2 < \cdots < j_s$ are free columns. Then the pivot variables $x_{i_1}, x_{i_2}, \ldots, x_{i_r}$ in the homogeneous system of linear equations are uniquely expressible in terms of the free variables $x_{j_1}, x_{j_2}, \ldots, x_{j_s}$ as

$$x_{i_t} = -\sum_{k=1}^{s} a_{tj_k}x_{j_k}.$$

The set $\{\bar{u}^1, \bar{u}^2, \ldots, \bar{u}^s\}$ is a basis for the space $N(A)$ of solutions of the homogeneous system, where $\bar{u}^k = (u^k_1, u^k_2, \ldots, u^k_n)$ is the unique solution of the homogeneous system corresponding to the choice $x_{j_l} = 0$, $l \neq k$, and $x_{j_k} = 1$ of the free variables. Indeed, $u^k_{j_l} = 0$ for $l \neq k$, $u^k_{j_k} = 1$, and $u^k_{i_t} = -a_{tj_k}$. The nullity $n(A) = s$, the number of free variables.

Proof Under the assumption, for all $t \leq r$, $a_{ti_t} = 1$ and $a_{li_t} = 0$ for $l \neq t$. The corresponding homogeneous system of linear equations is given by

$$\begin{aligned} a_{1i_1}x_{i_1} + a_{1j_1}x_{j_1} + a_{1j_2}x_{j_2} + \cdots + a_{1j_s}x_{j_s} &= 0, \\ a_{2i_2}x_{i_2} + a_{2j_1}x_{j_1} + a_{2j_2}x_{j_2} + \cdots + a_{2j_s}x_{j_s} &= 0, \\ &\ \ \vdots \\ a_{ri_r}x_{i_r} + a_{rj_1}x_{j_1} + a_{rj_2}x_{j_2} + \cdots + a_{rj_s}x_{j_s} &= 0. \end{aligned}$$

The rest of the equations, if any, are the identities

$$0x_1 + 0x_2 + \cdots + 0x_n = 0.$$

Evidently, each pivot variable is uniquely expressible in terms of the free variables as described in the proposition. Further, the set $S = \{\bar{u}^1, \bar{u}^2, \ldots, \bar{u}^s\}$ of solutions is a basis of the space $N(A)$ of solutions, for any solution with values $\alpha_1, \alpha_2, \ldots, \alpha_s$ of the free variables $x_{j_1}, x_{j_2}, \ldots, x_{j_s}$ is uniquely expressible as the linear combination $\alpha_1\bar{u}^1 + \alpha_2\bar{u}^2 + \cdots + \alpha_s\bar{u}^s$. The rest is evident. ♯
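The recipe of Proposition 2.4.11 translates directly into code. The following minimal sketch assumes its input is already in reduced row echelon form (it does not check this), uses 0-based column indices, and uses the hypothetical names `pivot_columns` and `null_space_basis`. It is applied to the matrix of Example 2.4.9.

```python
def pivot_columns(R):
    """Column index of the first nonzero entry of each nonzero row of R."""
    pivots = []
    for row in R:
        for j, entry in enumerate(row):
            if entry != 0:
                pivots.append(j)
                break
    return pivots

def null_space_basis(R):
    """One basis vector per free column: that free variable is set to 1,
    the other free variables to 0, and pivot variable t to -R[t][free]."""
    n = len(R[0])
    pivots = pivot_columns(R)
    free = [j for j in range(n) if j not in pivots]
    basis = []
    for f in free:
        u = [0] * n
        u[f] = 1
        for t, p in enumerate(pivots):   # row t carries the pivot of column p
            u[p] = -R[t][f]
        basis.append(u)
    return basis

R = [[1, 2, 0, 0, 2],
     [0, 0, 1, 0, 1],
     [0, 0, 0, 1, 2],
     [0, 0, 0, 0, 0]]
basis = null_space_basis(R)
```

For this matrix the pivot columns are the 1st, 3rd and 4th, and `basis` consists of $(-2, 1, 0, 0, 0)$ and $(-2, 0, -1, -2, 1)$, so the nullity is 2.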


Proposition 2.4.12 Consider the system of linear equations given by the matrix equation

$$A\bar{x}^t = \bar{b}^t,$$

where $A$ is a reduced row echelon $m \times n$ matrix with entries in a field $F$. Suppose that the columns $C_{i_1}(A), C_{i_2}(A), \ldots, C_{i_r}(A)$ with $i_1 < i_2 < \cdots < i_r$ are pivot columns and the columns $C_{j_1}(A), C_{j_2}(A), \ldots, C_{j_s}(A)$ with $j_1 < j_2 < \cdots < j_s$ are free columns. Then the system of linear equations is consistent if and only if $b_k = 0$ for all $k \geq r + 1$, or equivalently, $\mathrm{rank}(A) = \mathrm{rank}(A^+)$. Further, then $\bar{v} = (v_1, v_2, \ldots, v_n)$, where $v_{i_t} = -a_{tj_1} + b_t$, $1 \leq t \leq r$, $v_{j_1} = 1$ and $v_{j_l} = 0$, $2 \leq l \leq s$, is a particular solution of the above nonhomogeneous system. Finally, a general solution $\bar{x}$ of the system of linear equations is given by

$$\bar{x} = \bar{v} + c_1\bar{u}^1 + c_2\bar{u}^2 + \cdots + c_s\bar{u}^s,$$

where $c_1, c_2, \ldots, c_s$ are arbitrary constants.

Proof From the previous proposition, a general solution of the homogeneous part of the above nonhomogeneous system of linear equations is given by

$$c_1\bar{u}^1 + c_2\bar{u}^2 + \cdots + c_s\bar{u}^s.$$

Further, the system of linear equations is given by

$$\begin{aligned} a_{1i_1}x_{i_1} + a_{1j_1}x_{j_1} + a_{1j_2}x_{j_2} + \cdots + a_{1j_s}x_{j_s} &= b_1, \\ a_{2i_2}x_{i_2} + a_{2j_1}x_{j_1} + a_{2j_2}x_{j_2} + \cdots + a_{2j_s}x_{j_s} &= b_2, \\ &\ \ \vdots \\ a_{ri_r}x_{i_r} + a_{rj_1}x_{j_1} + a_{rj_2}x_{j_2} + \cdots + a_{rj_s}x_{j_s} &= b_r. \end{aligned}$$

The rest of the equations, if any, are the identities

$$0x_1 + 0x_2 + \cdots + 0x_n = b_k, \quad k \geq r + 1.$$

Clearly, the system is inconsistent if $b_k \neq 0$ for any $k \geq r + 1$. Now, suppose that $b_k = 0$ for all $k \geq r + 1$. Putting the free variable $x_{j_1} = 1$, and $x_{j_k} = 0$ for $2 \leq k \leq s$, we get a particular solution $\bar{v} = (v_1, v_2, \ldots, v_n)$ of the system, where $v_{i_t} = -a_{tj_1} + b_t$, $1 \leq t \leq r$, $v_{j_1} = 1$, and $v_{j_l} = 0$, $2 \leq l \leq s$. From Proposition 2.3.8, we get a general solution

$$\bar{x} = \bar{v} + c_1\bar{u}^1 + c_2\bar{u}^2 + \cdots + c_s\bar{u}^s$$

of the system, where $c_1, c_2, \ldots, c_s$ are arbitrary constants. ♯


Example 2.4.13 Consider the system of linear equations given by the matrix equation

$$A\bar{x}^t = \bar{b}^t,$$

where $A$ is the matrix given by

$$\begin{bmatrix} 1 & 0 & 1 & 0 & 2 \\ 0 & 1 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.$$

The corresponding system of linear equations is given by

$$\begin{aligned} x_1 + 0x_2 + x_3 + 0x_4 + 2x_5 &= b_1, \\ 0x_1 + x_2 + x_3 + 0x_4 + x_5 &= b_2, \\ 0x_1 + 0x_2 + 0x_3 + x_4 + x_5 &= b_3, \\ 0x_1 + 0x_2 + 0x_3 + 0x_4 + 0x_5 &= b_4. \end{aligned}$$

The matrix $A$ is in reduced row echelon form with the pivot columns $C_1$, $C_2$, $C_4$, and the free columns $C_3$ and $C_5$. The nonzero rows $R_1$, $R_2$, $R_3$ form a basis of the row space, and the pivot columns $C_1$, $C_2$, $C_4$ of $A$ form a basis of the column space of $A$. Row rank $= 3 =$ column rank of $A$. For the system to be consistent, we need $b_4 = 0$. Assuming that $b_4 = 0$, we find a general solution of the system. We first find a basis of the solution space $N(A)$ of the homogeneous part $A\bar{x}^t = \bar{0}^t$ of the given system of linear equations. $x_3$ and $x_5$ are free variables. Putting $x_3 = 1$ and $x_5 = 0$, we get a solution $\bar{u}^1 = (-1, -1, 1, 0, 0)$ of the homogeneous part of the system. Further, putting $x_3 = 0$ and $x_5 = 1$, we get a solution $\bar{u}^2 = (-2, -1, 0, -1, 1)$ of the homogeneous part of the system. The set $\{\bar{u}^1, \bar{u}^2\}$ is a basis of the space $N(A)$ of solutions of the homogeneous part. The nullity of $A$ is 2. Finally, putting $x_3 = 1$ and $x_5 = 0$ in the given system, we get a particular solution $\bar{v} = (-1 + b_1, -1 + b_2, 1, b_3, 0)$ of the given nonhomogeneous system of linear equations. In turn, a general solution of the given nonhomogeneous system of linear equations is $\bar{v} + c_1\bar{u}^1 + c_2\bar{u}^2$.
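The general solution found in this example can be spot-checked numerically. The sketch below is an illustration only: the right-hand side $(4, -2, 7, 0)$ and the sampled constants are arbitrary choices, and the helper names are hypothetical.

```python
A = [[1, 0, 1, 0, 2],
     [0, 1, 1, 0, 1],
     [0, 0, 0, 1, 1],
     [0, 0, 0, 0, 0]]

def apply(matrix, x):
    """Product of matrix with the column given by x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in matrix]

def general_solution(b, c1, c2):
    """v + c1*u1 + c2*u2 as in the example; requires b[3] == 0."""
    b1, b2, b3, b4 = b
    if b4 != 0:
        raise ValueError("the system is inconsistent unless b4 = 0")
    v = [-1 + b1, -1 + b2, 1, b3, 0]     # particular solution (x3 = 1, x5 = 0)
    u1 = [-1, -1, 1, 0, 0]               # free variables x3 = 1, x5 = 0
    u2 = [-2, -1, 0, -1, 1]              # free variables x3 = 0, x5 = 1
    return [vi + c1 * p + c2 * q for vi, p, q in zip(v, u1, u2)]

b = [4, -2, 7, 0]
checks = [apply(A, general_solution(b, c1, c2)) == b
          for c1 in range(-2, 3) for c2 in range(-2, 3)]
```

Every entry of `checks` is `True`, as Proposition 2.4.12 predicts.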


Observe that a square matrix in reduced row echelon form has no zero rows if 
and only if all the rows, and so also all columns have pivots, or equivalently, it is the 
identity matrix. Since a matrix with a zero row is singular, we have the following: 


Proposition 2.4.14 A square matrix in reduced row echelon form is nonsingular if 
and only if it is the identity matrix. tt 


Elementary operations on a system of linear equations, or equivalently, elementary row operations on the coefficient and augmented matrices, transform the system into an equivalent system of linear equations. Further, if the coefficient matrix of the system of linear equations is in reduced row echelon form, then, as observed above, a general solution of the system is easily obtained. As such, it is natural to look for an algorithm to reduce an arbitrary matrix into a matrix in reduced row echelon form by using elementary row operations. The following theorem gives such an algorithm.


Theorem 2.4.15 Using elementary row operations, every matrix can be reduced to 
a matrix in reduced row echelon form. 


Proof Let $A$ be an $m \times n$ matrix. If $A$ is the zero matrix, then it is already in reduced row echelon form. Suppose that $A$ is a nonzero matrix. Let $j_1$ be the least number such that the column $C_{j_1}(A)$ is a nonzero column. Further, let $i_1$ be the smallest number such that $a_{i_1j_1} \neq 0$. Interchanging the $i_1$th row and the first row, we may assume that $a_{1j_1} \neq 0$, and $a_{ik} = 0$ for all $i$ and all $k < j_1$. Multiplying the first row by $a_{1j_1}^{-1}$, we may assume that $a_{1j_1} = 1$, and $a_{ik} = 0$ for all $k < j_1$. Next, adding $-a_{ij_1}$ times the first row to the $i$th row for each $i \geq 2$, we reduce $A$ to a matrix $[a_{ij}]$, where $a_{1j_1} = 1$, $a_{ij_1} = 0$ for all $i \geq 2$, and $a_{ik} = 0$ for all $k \leq j_1 - 1$. If in this reduced matrix $a_{ij} = 0$ for all $i \geq 2$ and all $j$, then it is already in reduced row echelon form. If not, let $j_2$ be the smallest number such that $a_{ij_2} \neq 0$ for some $i \geq 2$. Further, let $i_2$ be the smallest number not less than 2 such that $a_{i_2j_2} \neq 0$. Note that $j_2 > j_1$. Interchanging the $i_2$th row and the second row, we may assume that $a_{2j_2} \neq 0$. Then, multiplying the second row by $a_{2j_2}^{-1}$, we may assume that $a_{2j_2} = 1$. In turn, adding $-a_{ij_2}$ times the second row to the $i$th row for each $i \neq 2$, $A$ may have been reduced to a matrix in reduced row echelon form. If not, proceed as before. This process reduces $A$ into reduced row echelon form after finitely many steps (at worst, at the $n$th step). ♯
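The procedure in the proof can be written out as a short routine. The following is a sketch under stated assumptions: entries are converted to exact rationals with Python's `fractions.Fraction` (so the field is taken to be $\mathbb{Q}$), indices are 0-based, and the function name `rref` is hypothetical.

```python
from fractions import Fraction

def rref(matrix):
    """Return (R, pivots): the reduced row echelon form of matrix and the
    0-based indices of its pivot columns."""
    a = [[Fraction(x) for x in row] for row in matrix]
    m, n = len(a), len(a[0])
    pivot_row, pivots = 0, []
    for col in range(n):
        if pivot_row == m:
            break
        # Smallest row index at or below pivot_row with a nonzero entry in col.
        src = next((i for i in range(pivot_row, m) if a[i][col] != 0), None)
        if src is None:
            continue                      # no pivot here: col is a free column
        a[pivot_row], a[src] = a[src], a[pivot_row]      # interchange rows
        lead = a[pivot_row][col]
        a[pivot_row] = [x / lead for x in a[pivot_row]]  # make the pivot 1
        for i in range(m):                # clear the rest of the pivot column
            if i != pivot_row and a[i][col] != 0:
                factor = a[i][col]
                a[i] = [x - factor * y for x, y in zip(a[i], a[pivot_row])]
        pivots.append(col)
        pivot_row += 1
    return a, pivots

R, pivots = rref([[0, 0, 2, 3, 8],
                  [2, 4, 1, 0, 5],
                  [1, 2, 1, 1, 5],
                  [5, 10, 6, 6, 28]])
```

For this input, `R` has rows (1, 2, 0, 0, 2), (0, 0, 1, 0, 1), (0, 0, 0, 1, 2), (0, 0, 0, 0, 0), with pivots in the 1st, 3rd and 4th columns. The number of pivots is the rank, and the number of remaining columns is the nullity.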


Corollary 2.4.16 Row rank of a matrix is the same as the column rank of the matrix. 


Proof From the Proposition 2.4.6, and the Proposition 2.4.7, row rank and column 
rank of a matrix are invariant under elementary row operations. From the Proposition 
2.4.10(iv), row rank of a matrix in reduced row echelon form is same as its column 
rank (equal to the number of pivots). Combining this with the Theorem 2.4.15, the 
result follows. tt 


Definition 2.4.17 Row rank of a matrix A, or equivalently, the column rank of a 
matrix is called the rank of the matrix. The rank of a matrix A is denoted by r(A). 


Corollary 2.4.18 Let A be am x n matrix. Then r(A) + n(A) = n. 


Proof Since the rank and the nullity remain invariant under elementary row operations, using Theorem 2.4.15, it is sufficient to prove the result for matrices in reduced row echelon form. For a matrix $A$ in reduced row echelon form, $r(A)$ is the number of pivot columns and $n(A)$ is the number of free columns. Clearly, a column is either a pivot column or a free column. ♯


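Example 2.4.19 Consider the system of linear equations

$$\begin{aligned} 2x_3 + 3x_4 + 8x_5 &= 1, \\ 2x_1 + 4x_2 + x_3 + 5x_5 &= 0, \\ x_1 + 2x_2 + x_3 + x_4 + 5x_5 &= 2, \\ 5x_1 + 10x_2 + 6x_3 + 6x_4 + 28x_5 &= a. \end{aligned}$$

The corresponding coefficient matrix $A$ is

$$A = \begin{bmatrix} 0 & 0 & 2 & 3 & 8 \\ 2 & 4 & 1 & 0 & 5 \\ 1 & 2 & 1 & 1 & 5 \\ 5 & 10 & 6 & 6 & 28 \end{bmatrix},$$

and the augmented matrix $A^+$ is

$$A^+ = \begin{bmatrix} 0 & 0 & 2 & 3 & 8 & 1 \\ 2 & 4 & 1 & 0 & 5 & 0 \\ 1 & 2 & 1 & 1 & 5 & 2 \\ 5 & 10 & 6 & 6 & 28 & a \end{bmatrix}.$$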

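We discuss the consistency of the above system of linear equations, and if it is consistent, we determine a general solution. For this purpose, we reduce the coefficient matrix $A$, and also the augmented matrix $A^+$, to reduced row echelon form simultaneously by using the algorithm described in the above theorem. The 1st column of $A$ is nonzero, and the smallest number $i$ for which $a_{i1} \neq 0$ is 2. Thus, interchanging the 1st and the 2nd rows of $A$, and of $A^+$, $A$ is transformed to

$$\begin{bmatrix} 2 & 4 & 1 & 0 & 5 \\ 0 & 0 & 2 & 3 & 8 \\ 1 & 2 & 1 & 1 & 5 \\ 5 & 10 & 6 & 6 & 28 \end{bmatrix},$$

and $A^+$ is transformed to

$$\begin{bmatrix} 2 & 4 & 1 & 0 & 5 & 0 \\ 0 & 0 & 2 & 3 & 8 & 1 \\ 1 & 2 & 1 & 1 & 5 & 2 \\ 5 & 10 & 6 & 6 & 28 & a \end{bmatrix}.$$

Multiplying the 1st row by $\frac{1}{2}$ to get the pivot entry 1 in the 1st row 1st column, the matrices are transformed to

$$\begin{bmatrix} 1 & 2 & \frac{1}{2} & 0 & \frac{5}{2} \\ 0 & 0 & 2 & 3 & 8 \\ 1 & 2 & 1 & 1 & 5 \\ 5 & 10 & 6 & 6 & 28 \end{bmatrix},$$

and to

$$\begin{bmatrix} 1 & 2 & \frac{1}{2} & 0 & \frac{5}{2} & 0 \\ 0 & 0 & 2 & 3 & 8 & 1 \\ 1 & 2 & 1 & 1 & 5 & 2 \\ 5 & 10 & 6 & 6 & 28 & a \end{bmatrix}.$$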


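Further, adding $-1$ times the 1st row to the 3rd row, and adding $-5$ times the 1st row to the 4th row, the matrices are transformed to

$$\begin{bmatrix} 1 & 2 & \frac{1}{2} & 0 & \frac{5}{2} \\ 0 & 0 & 2 & 3 & 8 \\ 0 & 0 & \frac{1}{2} & 1 & \frac{5}{2} \\ 0 & 0 & \frac{7}{2} & 6 & \frac{31}{2} \end{bmatrix},$$

and to

$$\begin{bmatrix} 1 & 2 & \frac{1}{2} & 0 & \frac{5}{2} & 0 \\ 0 & 0 & 2 & 3 & 8 & 1 \\ 0 & 0 & \frac{1}{2} & 1 & \frac{5}{2} & 2 \\ 0 & 0 & \frac{7}{2} & 6 & \frac{31}{2} & a \end{bmatrix}.$$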

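Here, in this transformed matrix, $a_{i2} = 0$ for all $i \geq 2$. Thus, the 2nd column is a free column. We look at the 3rd column. The 2nd row 3rd column entry $a_{23} = 2 \neq 0$. We divide the 2nd row by 2 to get the pivot entry 1 in the 2nd row 3rd column. The matrices, thus, reduce to

$$\begin{bmatrix} 1 & 2 & \frac{1}{2} & 0 & \frac{5}{2} \\ 0 & 0 & 1 & \frac{3}{2} & 4 \\ 0 & 0 & \frac{1}{2} & 1 & \frac{5}{2} \\ 0 & 0 & \frac{7}{2} & 6 & \frac{31}{2} \end{bmatrix},$$

and to

$$\begin{bmatrix} 1 & 2 & \frac{1}{2} & 0 & \frac{5}{2} & 0 \\ 0 & 0 & 1 & \frac{3}{2} & 4 & \frac{1}{2} \\ 0 & 0 & \frac{1}{2} & 1 & \frac{5}{2} & 2 \\ 0 & 0 & \frac{7}{2} & 6 & \frac{31}{2} & a \end{bmatrix}.$$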


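In turn, to make all other entries in this pivot column 0, we add $-\frac{1}{2}$ times the 2nd row to the 1st row, $-\frac{1}{2}$ times the 2nd row to the 3rd row, and $-\frac{7}{2}$ times the 2nd row to the 4th row. The matrices reduce to

$$\begin{bmatrix} 1 & 2 & 0 & -\frac{3}{4} & \frac{1}{2} \\ 0 & 0 & 1 & \frac{3}{2} & 4 \\ 0 & 0 & 0 & \frac{1}{4} & \frac{1}{2} \\ 0 & 0 & 0 & \frac{3}{4} & \frac{3}{2} \end{bmatrix},$$

and to

$$\begin{bmatrix} 1 & 2 & 0 & -\frac{3}{4} & \frac{1}{2} & -\frac{1}{4} \\ 0 & 0 & 1 & \frac{3}{2} & 4 & \frac{1}{2} \\ 0 & 0 & 0 & \frac{1}{4} & \frac{1}{2} & \frac{7}{4} \\ 0 & 0 & 0 & \frac{3}{4} & \frac{3}{2} & a - \frac{7}{4} \end{bmatrix}.$$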


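The 3rd row 4th column entry $a_{34} = \frac{1}{4} \neq 0$. We multiply the 3rd row by 4 to get the pivot entry 1 in the 3rd row 4th column. Thus, the matrices further reduce to

$$\begin{bmatrix} 1 & 2 & 0 & -\frac{3}{4} & \frac{1}{2} \\ 0 & 0 & 1 & \frac{3}{2} & 4 \\ 0 & 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & \frac{3}{4} & \frac{3}{2} \end{bmatrix},$$

and to

$$\begin{bmatrix} 1 & 2 & 0 & -\frac{3}{4} & \frac{1}{2} & -\frac{1}{4} \\ 0 & 0 & 1 & \frac{3}{2} & 4 & \frac{1}{2} \\ 0 & 0 & 0 & 1 & 2 & 7 \\ 0 & 0 & 0 & \frac{3}{4} & \frac{3}{2} & a - \frac{7}{4} \end{bmatrix}.$$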


In turn, we add $\frac{3}{4}$ times the 3rd row to the 1st row, $-\frac{3}{2}$ times the 3rd row to the 2nd row, and $-\frac{3}{4}$ times the 3rd row to the 4th row to make the rest of the entries in this pivot column 0. The coefficient matrix $A$ reduces to the following matrix

$$\begin{bmatrix} 1 & 2 & 0 & 0 & 2 \\ 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix},$$

which is in reduced row echelon form, and the augmented matrix $A^+$ gets transformed to

$$\begin{bmatrix} 1 & 2 & 0 & 0 & 2 & 5 \\ 0 & 0 & 1 & 0 & 1 & -10 \\ 0 & 0 & 0 & 1 & 2 & 7 \\ 0 & 0 & 0 & 0 & 0 & a - 7 \end{bmatrix}.$$

Thus, the given system of linear equations is equivalent to the system of linear equations whose coefficient matrix and augmented matrix are the two matrices displayed above.


In turn, using the discussions and the results above, we have the following:

(i) A basis of the row space of $A$ is $\{(1, 2, 0, 0, 2), (0, 0, 1, 0, 1), (0, 0, 0, 1, 2)\}$. The rank $r(A) = 3$.

(ii) Putting the free variable $x_2 = 1$, and the free variable $x_5 = 0$, we get a solution $(-2, 1, 0, 0, 0)$ of the homogeneous part of the system. Further, putting the free variable $x_2 = 0$, and the free variable $x_5 = 1$, we get another solution $(-2, 0, -1, -2, 1)$ of the homogeneous part of the system. The set $\{(-2, 1, 0, 0, 0), (-2, 0, -1, -2, 1)\}$ is a basis of the solution space $N(A)$ of the homogeneous part. A general solution of the homogeneous part of the system is

$$c_1(-2, 1, 0, 0, 0) + c_2(-2, 0, -1, -2, 1),$$

where $c_1, c_2$ are arbitrary constants.

(iii) The nonhomogeneous system is consistent if and only if $3 = r(A) = r(A^+)$, or equivalently, $a = 7$. Then, giving the values $x_2 = 1$ and $x_5 = 0$ to the free variables in the nonhomogeneous system, we get a particular solution $(3, 1, -10, 7, 0)$. Thus, a general solution of the nonhomogeneous system is

$$(3, 1, -10, 7, 0) + c_1(-2, 1, 0, 0, 0) + c_2(-2, 0, -1, -2, 1),$$

where $c_1, c_2$ are arbitrary constants.
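The conclusions of this example can be re-derived mechanically. The sketch below row-reduces the augmented matrix for two sample values of $a$ and checks the general solution when $a = 7$. It assumes exact rational arithmetic with `fractions.Fraction`, 0-based column indices, and a hypothetical helper `rref` that follows the procedure of Theorem 2.4.15.

```python
from fractions import Fraction

def rref(matrix):
    """Reduced row echelon form and 0-based pivot columns of matrix."""
    a = [[Fraction(x) for x in row] for row in matrix]
    m, n = len(a), len(a[0])
    pivot_row, pivots = 0, []
    for col in range(n):
        if pivot_row == m:
            break
        src = next((i for i in range(pivot_row, m) if a[i][col] != 0), None)
        if src is None:
            continue
        a[pivot_row], a[src] = a[src], a[pivot_row]
        lead = a[pivot_row][col]
        a[pivot_row] = [x / lead for x in a[pivot_row]]
        for i in range(m):
            if i != pivot_row and a[i][col] != 0:
                factor = a[i][col]
                a[i] = [x - factor * y for x, y in zip(a[i], a[pivot_row])]
        pivots.append(col)
        pivot_row += 1
    return a, pivots

def augmented(a_value):
    return [[0, 0, 2, 3, 8, 1],
            [2, 4, 1, 0, 5, 0],
            [1, 2, 1, 1, 5, 2],
            [5, 10, 6, 6, 28, a_value]]

reduced_7, pivots_7 = rref(augmented(7))   # consistent case
reduced_8, pivots_8 = rref(augmented(8))   # a pivot appears in the last column

def solves(x, a_value):
    return all(sum(c * xi for c, xi in zip(row[:-1], x)) == row[-1]
               for row in augmented(a_value))

general = [(3 - 2 * c1 - 2 * c2, 1 + c1, -10 - c2, 7 - 2 * c2, c2)
           for c1 in range(-2, 3) for c2 in range(-2, 3)]
all_ok = all(solves(x, 7) for x in general)
```

A pivot in the last column of the reduced augmented matrix means $r(A^+) > r(A)$, which is exactly the inconsistent case $a \neq 7$.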


Definition 2.4.20 A square matrix $E$ obtained by applying an elementary row operation on the identity matrix is called an elementary matrix.

Example 2.4.21 The matrix

$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

is an elementary matrix which is obtained by multiplying the 3rd row of the identity matrix $I_4$ by 3. The matrix

$$\begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

is an elementary matrix which is obtained by interchanging the 1st row and the 3rd row of the identity matrix $I_4$. Again, the matrix

$$\begin{bmatrix} 1 & 0 & 3 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

is also an elementary matrix which is obtained by adding 3 times the 3rd row of the identity matrix $I_4$ to its 1st row.


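$\tau_{ij}$ denotes the elementary matrix which is obtained by interchanging the $i$th row and the $j$th row of the identity matrix. Thus,

$$\begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} = \tau_{13}.$$

The elementary matrix which is obtained by adding $\lambda$ times the $j$th row of the identity matrix to its $i$th row is denoted by $E_{ij}^{\lambda}$. Indeed, $E_{ij}^{\lambda}$ is the matrix all of whose diagonal entries are 1, the $i$th row $j$th column entry is $\lambda$, and the rest of the entries are 0. Thus,

$$\begin{bmatrix} 1 & 0 & 3 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} = E_{13}^{3}.$$

The matrices $E_{ij}^{\lambda}$ are called the transvections.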

It can be easily observed that the effect of multiplying a matrix $A$ from the left (right) by an elementary matrix $E$ is to apply to $A$ the elementary row (column) operation which was used to get the matrix $E$ from the identity matrix. Thus, $\tau_{ij}A$ is the matrix obtained by interchanging the $i$th row and the $j$th row of $A$, and $E_{ij}^{\lambda}A$ is the matrix obtained by adding $\lambda$ times the $j$th row of $A$ to its $i$th row. It is straightforward, in particular, to verify the following relations, called the Steinberg relations, among the transvections in $M_n(F)$.


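(i) $E_{ij}^{\lambda} \cdot E_{ij}^{\mu} = E_{ij}^{\lambda + \mu}$, $i \neq j$. In particular, $E_{ij}^{\lambda} \cdot E_{ij}^{-\lambda} = E_{ij}^{0} = I_n$. Thus, $E_{ij}^{\lambda}$ is invertible, and its inverse is $E_{ij}^{-\lambda}$.
(ii) For $i \neq l$, $j \neq k$, $E_{ij}^{\lambda}$ and $E_{kl}^{\mu}$ commute.
(iii) For $i \neq l$, $E_{ij}^{\lambda}E_{jl}^{\mu}E_{ij}^{-\lambda}E_{jl}^{-\mu} = E_{il}^{\lambda\mu}$.
(iv) For $j \neq k$, $E_{ij}^{\lambda}E_{ki}^{\mu}E_{ij}^{-\lambda}E_{ki}^{-\mu} = E_{kj}^{-\lambda\mu}$.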
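The relations listed above, together with the effect of left multiplication by a transvection, can be checked on sample values. The sketch below is an illustration only: it takes $n = 4$, $\lambda = 3$, $\mu = 5$ and particular indices, uses 1-based indices for transvections to match the text, and the helper names are hypothetical. A check on sample values is of course not a proof.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def transvection(n, i, j, lam):
    """E_ij^lam with 1-based indices i != j: identity plus lam at (i, j)."""
    e = identity(n)
    e[i - 1][j - 1] = lam
    return e

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def product(*mats):
    """Product of the given matrices, taken from left to right."""
    result = mats[0]
    for m in mats[1:]:
        result = matmul(result, m)
    return result

n, lam, mu = 4, 3, 5

def E(i, j, c):
    return transvection(n, i, j, c)

# Left multiplication by E_13^3 adds 3 times the 3rd row to the 1st row.
A = [[1, 2], [3, 4], [5, 6], [7, 8]]
row_op = matmul(E(1, 3, 3), A)

rel_i = product(E(1, 2, lam), E(1, 2, mu)) == E(1, 2, lam + mu)
rel_ii = product(E(1, 2, lam), E(3, 4, mu)) == product(E(3, 4, mu), E(1, 2, lam))
rel_iii = product(E(1, 2, lam), E(2, 3, mu),
                  E(1, 2, -lam), E(2, 3, -mu)) == E(1, 3, lam * mu)
rel_iv = product(E(1, 2, lam), E(3, 1, mu),
                 E(1, 2, -lam), E(3, 1, -mu)) == E(3, 2, -lam * mu)
```

All four flags evaluate to `True` for these sample values, and `row_op` has first row (16, 20) with the remaining rows of `A` unchanged.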


Proposition 2.4.22 Let $A$ be an $m \times n$ matrix. Then we can find a nonsingular $m \times m$ matrix $P$ such that $PA$ is a matrix in reduced row echelon form. In particular, a square matrix $A$ is nonsingular if and only if its reduced row echelon form $PA$ is the identity matrix.

Proof Applying an elementary row operation on $A$ is equivalent to multiplying $A$ from the left by an elementary matrix. Since every matrix can be reduced to a matrix in reduced row echelon form (Theorem 2.4.15), multiplying $A$ successively by elementary matrices from the left, we arrive at a matrix in reduced row echelon form. Since elementary matrices are nonsingular, and a product of nonsingular matrices is nonsingular, we get a nonsingular matrix $P$ such that $PA$ is a matrix in reduced row echelon form. Since $P$ is nonsingular, $A$ is nonsingular if and only if $PA$ is nonsingular. From Proposition 2.4.14, $A$ is nonsingular if and only if $PA$ is the identity matrix. ♯


The above discussion and the results give an algorithm to determine a nonsingular 
matrix P such that PA is a reduced row echelon matrix. In particular, it gives an 
algorithm to check if a square matrix A is invertible, and if so, to find the inverse of 
A. We further illustrate the algorithm by means of examples. 
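The algorithm can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the author's program: the function name `rref_with_transform` is hypothetical, entries are exact rationals (`fractions.Fraction`), and the pivot in each column is taken to be the first nonzero entry at or below the current pivot row.

```python
from fractions import Fraction

def rref_with_transform(A):
    """Return (P, R) with R the reduced row echelon form of A and P A = R.

    Every row operation applied to the working copy of A is also applied
    to a copy of the identity matrix, which ends up as P.
    """
    m, n = len(A), len(A[0])
    R = [[Fraction(x) for x in row] for row in A]
    P = [[Fraction(int(i == j)) for j in range(m)] for i in range(m)]
    pivot_row = 0
    for col in range(n):
        if pivot_row == m:
            break
        # first row at or below pivot_row with a nonzero entry in this column
        r = next((i for i in range(pivot_row, m) if R[i][col] != 0), None)
        if r is None:
            continue  # free column: no pivot here
        R[pivot_row], R[r] = R[r], R[pivot_row]   # interchange rows
        P[pivot_row], P[r] = P[r], P[pivot_row]
        c = R[pivot_row][col]                     # scale pivot entry to 1
        R[pivot_row] = [x / c for x in R[pivot_row]]
        P[pivot_row] = [x / c for x in P[pivot_row]]
        for i in range(m):                        # clear the rest of the column
            if i != pivot_row and R[i][col] != 0:
                f = R[i][col]
                R[i] = [a - f * b for a, b in zip(R[i], R[pivot_row])]
                P[i] = [a - f * b for a, b in zip(P[i], P[pivot_row])]
        pivot_row += 1
    return P, R
```

For a square matrix, `R` is the identity exactly when the matrix is invertible, and in that case `P` is its inverse.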


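Example 2.4.23 Consider the matrix

\[
A = \begin{bmatrix} 0 & 0 & 3 & 1 & 2 \\ 0 & 1 & 2 & 0 & 0 \\ 0 & 2 & 1 & -1 & 1 \\ 0 & 1 & 1 & -\frac{1}{3} & 0 \end{bmatrix}.
\]

Using the elementary row operations, we transform the matrix $A$ into a matrix in
reduced row echelon form, and simultaneously find a nonsingular matrix $P$ such that
$PA$ is a matrix in reduced row echelon form. We start with the pair

\[
I_4 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \qquad
A = \begin{bmatrix} 0 & 0 & 3 & 1 & 2 \\ 0 & 1 & 2 & 0 & 0 \\ 0 & 2 & 1 & -1 & 1 \\ 0 & 1 & 1 & -\frac{1}{3} & 0 \end{bmatrix}.
\]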


There is no nonzero entry in the first column of $A$, and so no pivot will appear in the
first column. We leave it and move to the second column. The first nonzero entry in
the second column of $A$ is 1, and it is in the second row. We interchange the first row
$R_1$ and the second row $R_2$ in the pair of matrices. The pair, thus, gets transformed to
the pair $(E_1, A_1)$ given by

\[
E_1 = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \qquad
A_1 = \begin{bmatrix} 0 & 1 & 2 & 0 & 0 \\ 0 & 0 & 3 & 1 & 2 \\ 0 & 2 & 1 & -1 & 1 \\ 0 & 1 & 1 & -\frac{1}{3} & 0 \end{bmatrix}
\]

(note that $E_1 A = A_1$). The entry 1 in the first row and second column of $A_1$ is the
pivot entry. To make the rest of the entries in this pivot column 0, we replace $R_3$ by
$R_3 - 2R_1$, and then $R_4$ by $R_4 - R_1$. In turn, the pair $(E_1, A_1)$ gets transformed to the
pair $(E_2, A_2)$ given by




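\[
E_2 = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & -2 & 1 & 0 \\ 0 & -1 & 0 & 1 \end{bmatrix}, \qquad
A_2 = \begin{bmatrix} 0 & 1 & 2 & 0 & 0 \\ 0 & 0 & 3 & 1 & 2 \\ 0 & 0 & -3 & -1 & 1 \\ 0 & 0 & -1 & -\frac{1}{3} & 0 \end{bmatrix}
\]

(again note that $E_2 A = A_2$, since $E_2$ records all the row operations applied so far).
The second row third column entry is 3,
which is nonzero. We replace $R_2$ by $\frac{1}{3}R_2$ to make it a pivot entry 1, and in turn, we
replace $R_1$ by $R_1 - 2R_2$, $R_3$ by $R_3 + 3R_2$, and $R_4$ by $R_4 + R_2$ to make all the rest of
the entries in this pivot column 0. Thus, the pair $(E_2, A_2)$ is transformed to the pair
$(E_3, A_3)$ given by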


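\[
E_3 = \begin{bmatrix} -\frac{2}{3} & 1 & 0 & 0 \\ \frac{1}{3} & 0 & 0 & 0 \\ 1 & -2 & 1 & 0 \\ \frac{1}{3} & -1 & 0 & 1 \end{bmatrix}, \qquad
A_3 = \begin{bmatrix} 0 & 1 & 0 & -\frac{2}{3} & -\frac{4}{3} \\ 0 & 0 & 1 & \frac{1}{3} & \frac{2}{3} \\ 0 & 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 0 & \frac{2}{3} \end{bmatrix}
\]

(again, note that $E_3 A = A_3$). Since the 3rd row 4th column and 4th row 4th column
entries are 0, there is no pivot in the 4th column; it is a free column. We go to the 5th
column. The 3rd row 5th column entry is 3, which is nonzero. We replace $R_3$ by $\frac{1}{3}R_3$ to
make the 3rd row 5th column entry a pivot entry 1, and then replace $R_1$ by $R_1 + \frac{4}{3}R_3$,
$R_2$ by $R_2 - \frac{2}{3}R_3$, and $R_4$ by $R_4 - \frac{2}{3}R_3$. Thus, the pair $(E_3, A_3)$ is transformed to the
pair $(E_4, A_4)$ given by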


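\[
E_4 = \begin{bmatrix} -\frac{2}{9} & \frac{1}{9} & \frac{4}{9} & 0 \\ \frac{1}{9} & \frac{4}{9} & -\frac{2}{9} & 0 \\ \frac{1}{3} & -\frac{2}{3} & \frac{1}{3} & 0 \\ \frac{1}{9} & -\frac{5}{9} & -\frac{2}{9} & 1 \end{bmatrix}, \qquad
A_4 = \begin{bmatrix} 0 & 1 & 0 & -\frac{2}{3} & 0 \\ 0 & 0 & 1 & \frac{1}{3} & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix},
\]

where $A_4$ is in reduced row echelon form, and $P = E_4$ is an invertible matrix such
that $PA = A_4$ is in reduced row echelon form.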


Example 2.4.24 Consider the matrix $A$ given by

\[
A = \begin{bmatrix} 0 & 1 & 3 \\ 1 & 0 & 2 \\ 0 & 2 & 1 \\ 1 & 1 & 1 \end{bmatrix}.
\]

We apply the following elementary row operations in succession,

(i) interchange $R_1$ and $R_2$,

(ii) replace $R_4$ by $R_4 - R_1$, $R_3$ by $R_3 - 2R_2$, and $R_4$ by $R_4 - R_2$,

(iii) replace $R_3$ by $-\frac{1}{5}R_3$, $R_1$ by $R_1 - 2R_3$, $R_2$ by $R_2 - 3R_3$, and $R_4$ by $R_4 + 4R_3$,

on $A$, and also on $I_4$. Then, $A$ reduces to the reduced row echelon form




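\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix},
\]

and $I_4$ reduces to

\[
P = \begin{bmatrix} -\frac{4}{5} & 1 & \frac{2}{5} & 0 \\ -\frac{1}{5} & 0 & \frac{3}{5} & 0 \\ \frac{2}{5} & 0 & -\frac{1}{5} & 0 \\ \frac{3}{5} & -1 & -\frac{4}{5} & 1 \end{bmatrix}.
\]

Thus, $PA$ is in the reduced row echelon form given above.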


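Example 2.4.25 Consider the $3 \times 3$ matrix $A$ given by

\[
A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 1 & 4 & 9 \end{bmatrix}.
\]

If we use the method of the above example, then $A$ reduces to the identity matrix $I_3$,
and $I_3$ reduces to

\[
P = \begin{bmatrix} 3 & -\frac{5}{2} & \frac{1}{2} \\ -3 & 4 & -1 \\ 1 & -\frac{3}{2} & \frac{1}{2} \end{bmatrix}.
\]

Thus, $A$ is invertible, and $PA = I_3$. Hence $P$ is the inverse of $A$.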


2.5 LU Factorization 


If the coefficient matrix of a system of linear equations is an upper triangular square
matrix $U$ with nonzero diagonal entries, then the solution is easily obtained by
inspection. For example, if a system of linear equations is given by the matrix equation
$U\overline{x}^t = \overline{b}^t$, where

\[
U = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 2 & 3 \\ 0 & 0 & 9 \end{bmatrix},
\]

and $\overline{b} = (b_1, b_2, b_3)$, then, evidently, the solution is $\left(b_1 - \frac{b_2}{2} + \frac{b_3}{18},\ \frac{3b_2 - b_3}{6},\ \frac{b_3}{9}\right)$.
Similarly, it is also easy to solve a system of linear equations whose coefficient matrix
is a lower triangular square matrix with nonzero diagonal entries. Further, suppose the
coefficient matrix $A$ is invertible, and it is expressed as $A = LU$, where $L$ is a lower
triangular matrix, and $U$ is an upper triangular matrix. Then, we first find the solution
$\overline{v}$ of $L\overline{y}^t = \overline{b}^t$, and then the solution $\overline{u}$ of $U\overline{x}^t = \overline{v}^t$. Clearly, $\overline{u}$ is the solution of
$A\overline{x}^t = \overline{b}^t$.
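The two triangular solves described above can be sketched as follows. This is a minimal Python illustration, assuming square triangular matrices with nonzero diagonal entries and exact rational arithmetic; the function names `forward_substitution` and `back_substitution` are illustrative and do not come from the text.

```python
from fractions import Fraction

def forward_substitution(L, b):
    """Solve L y = b for a lower triangular L with nonzero diagonal."""
    n = len(b)
    y = [Fraction(0)] * n
    for i in range(n):
        # entries y[0..i-1] are already known
        s = sum(L[i][j] * y[j] for j in range(i))
        y[i] = (Fraction(b[i]) - s) / L[i][i]
    return y

def back_substitution(U, v):
    """Solve U x = v for an upper triangular U with nonzero diagonal."""
    n = len(v)
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        # entries x[i+1..n-1] are already known
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (Fraction(v[i]) - s) / U[i][i]
    return x
```

Given a factorization $A = LU$, a system with coefficient matrix $A$ and right-hand side `b` is then solved by `back_substitution(U, forward_substitution(L, b))`.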




The above discussion prompts us to look at the problem of factorizing an invertible 
matrix A as a product LU of a lower triangular matrix L and an upper triangular 
matrix U. This, in general, is not possible. 


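Example 2.5.1 Suppose that

\[
\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} a & 0 \\ b & c \end{bmatrix} \begin{bmatrix} u & v \\ 0 & w \end{bmatrix}.
\]

Then $au = 0$, $av = 1$, $bu = 1$. This, however, is impossible, for $av = 1$ gives
$a \neq 0$, so $au = 0$ forces $u = 0$, contradicting $bu = 1$. This shows that
the invertible matrix

\[
\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
\]

cannot be expressed as a product of a lower triangular and an upper triangular matrix.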


Observe that the matrix

\[
\begin{bmatrix} 1 & 1 & 1 \\ 0 & 0 & 3 \\ 0 & 2 & 9 \end{bmatrix}
\]

is also not expressible as a product of a lower triangular and an upper triangular matrix.


The reason behind the impossibility of expressing the above matrices as products
of lower and upper triangular matrices is that, while reducing these matrices to
row echelon form, we are forced either to interchange rows, or to add a nonzero
multiple of the $k$th row to the $l$th row for some $k > l$. Equivalently, we need to multiply
from the left by a corresponding elementary matrix $\tau_{ij}$, or by a corresponding matrix
$E_{lk}^{\lambda}$ with $k > l$. Obviously, these matrices are not lower triangular matrices. Indeed, if, while
reducing $A$ to an upper triangular (row echelon) form, elementary row operations of the above
type are not needed, then we can find a lower triangular matrix $P$ with diagonal
entries 1 so that $PA$ is upper triangular. In turn, $A = LU$, where $L = P^{-1}$. We
illustrate it by means of examples.


Example 2.5.2 Consider the matrix $A$ given by

\[
A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 1 & 4 & 9 \end{bmatrix},
\]

and the system of linear equations given by the matrix equation
$A\overline{x}^t = [1, 2, 3]^t$.
Adding $-1$ times the 1st row of $A$ to the 2nd row, and then adding $-1$ times the 1st
row to the 3rd row, or equivalently, multiplying $A$ from the left by the matrix $E_{31}^{-1}E_{21}^{-1}$,
we obtain that




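\[
E_{31}^{-1}E_{21}^{-1}A = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 0 & 3 & 8 \end{bmatrix}.
\]

Again, adding $-3$ times the 2nd row of the above matrix to its 3rd row, or equivalently,
multiplying $E_{31}^{-1}E_{21}^{-1}A$ from the left by $E_{32}^{-3}$, we obtain that $E_{32}^{-3}E_{31}^{-1}E_{21}^{-1}A$ is the upper
triangular matrix $U$ given by

\[
U = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 0 & 0 & 2 \end{bmatrix}.
\]

Thus, $A = LU$, where $L = E_{21}^{1}E_{31}^{1}E_{32}^{3}$ is the lower triangular matrix given by

\[
L = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 3 & 1 \end{bmatrix}.
\]

Now, to find the solution of $A\overline{x}^t = [1, 2, 3]^t$, we first find the solution of $L\overline{y}^t = [1, 2, 3]^t$.
Equating the corresponding entries of both sides, $y_1 = 1$, $y_1 + y_2 = 2$, and $y_1 +
3y_2 + y_3 = 3$. This gives the solution $[1, 1, -1]^t$ of $L\overline{y}^t = [1, 2, 3]^t$. Finally, we find
the solution of $U\overline{x}^t = [1, 1, -1]^t$ to get the solution of the original equation $A\overline{x}^t =
[1, 2, 3]^t$. Equating the entries of both sides in the equation $U\overline{x}^t = [1, 1, -1]^t$, we get
that $2x_3 = -1$, $x_2 + 2x_3 = 1$, and $x_1 + x_2 + x_3 = 1$. Evidently, $x_3 = -\frac{1}{2}$, $x_2 =
2$, and $x_1 = -\frac{1}{2}$.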
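The elimination used in this example can be sketched in Python. This is a minimal illustration under assumptions, not the author's procedure verbatim: the name `lu_no_pivot` is hypothetical, arithmetic is exact over the rationals, $L$ is built with diagonal entries 1, and the function simply gives up (returns `None`) as soon as a row interchange would be needed.

```python
from fractions import Fraction

def lu_no_pivot(A):
    """Try to write a square matrix A as L U using only the operations
    'add a multiple of row k to a lower row i' (i > k).

    Returns (L, U) with L unit lower triangular and U upper triangular,
    or None when elimination would require a row interchange.
    """
    n = len(A)
    U = [[Fraction(x) for x in row] for row in A]
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(n):
        if U[k][k] == 0:
            if any(U[i][k] != 0 for i in range(k + 1, n)):
                return None  # zero pivot with a nonzero entry below it
            continue         # column already clear below the diagonal
        for i in range(k + 1, n):
            f = U[i][k] / U[k][k]
            L[i][k] = f      # L records the multiplier that was subtracted
            U[i] = [a - f * b for a, b in zip(U[i], U[k])]
    return L, U
```

Recording the multiplier $f$ in position $(i, k)$ of $L$ corresponds to the fact that the inverse of $E_{ik}^{-f}$ is $E_{ik}^{f}$.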


2.6 Equivalence of Matrices, Normal Form 


Definition 2.6.1 Two $m \times n$ matrices $A$ and $B$ with entries in a field $F$ are said to
be equivalent if there exists a nonsingular $m \times m$ matrix $P$, and a nonsingular $n \times n$
matrix $Q$ such that $A = PBQ$.


Clearly, the relation of being equivalent is an equivalence relation on $M_{mn}(F)$.
We determine a unique representative of each equivalence class of equivalent matrices.


Definition 2.6.2 An $m \times n$ matrix $A$ is said to be in normal form if there is $r \leq
\min(m, n)$ such that

\[
A = \begin{bmatrix} I_r & O_{r\,n-r} \\ O_{m-r\,r} & O_{m-r\,n-r} \end{bmatrix},
\]

where $O_{m\,n}$ denotes the zero $m \times n$ matrix.


Theorem 2.6.3 Every $m \times n$ matrix is equivalent to a unique matrix in normal form.




Proof Applying an elementary row operation on a matrix $A$ is equivalent to multiplying
$A$ from the left by an elementary matrix, and applying an elementary column operation is
equivalent to multiplying $A$ from the right by an elementary matrix. Since all elementary
matrices are nonsingular, and a product of nonsingular matrices is nonsingular,
it is sufficient to show that every matrix can be reduced to a matrix in normal form
with the help of elementary row and elementary column operations. The proof of
this fact is by induction on $\max(m, n)$, where $m$ is the number of rows and $n$ the
number of columns. If $\max(m, n) = 1$, then $m = 1 = n$, and $A = [a_{11}]$ is a $1 \times 1$
matrix. If $A = [0]$, then it is already in normal form. If $a_{11} \neq 0$, then multiplying
the row by $a_{11}^{-1}$, we reduce it to the normal form $[1]$. Assume that the result is true
for all $r \times s$ matrices with $\max(r, s) < \max(m, n)$. Let $A = [a_{ij}]$ be an $m \times n$
matrix. If $A = O_{m\,n}$, then it is already in normal form, and there is nothing to do.
Suppose that $A \neq O_{m\,n}$. Suppose that $a_{kl} \neq 0$. Interchanging the 1st row and the $k$th row,
and then interchanging the 1st column and the $l$th column, we may suppose that $a_{11} \neq 0$,
and then multiplying the 1st row by $a_{11}^{-1}$, we may further suppose that $a_{11} = 1$. After
this we add $-a_{1j}$ times the first column to the $j$th column for all $j \neq 1$, and then $-a_{i1}$ times the
first row to the $i$th row for all $i \neq 1$. This reduces the matrix $A$ into the form

\[
\begin{bmatrix} 1 & O_{1\,n-1} \\ O_{m-1\,1} & B \end{bmatrix},
\]

where $B$ is an $(m-1) \times (n-1)$ matrix. This also gives us a nonsingular $m \times m$ matrix $C$,
and an $n \times n$ nonsingular matrix $D$ such that

\[
CAD = \begin{bmatrix} 1 & O_{1\,n-1} \\ O_{m-1\,1} & B \end{bmatrix}.
\]

By the induction hypothesis there is an $(m-1) \times (m-1)$ nonsingular matrix $C'$, and
there is a nonsingular $(n-1) \times (n-1)$ matrix $D'$ such that $C'BD'$ is in normal form, say

\[
C'BD' = \begin{bmatrix} I_{r-1} & O_{r-1\,n-r} \\ O_{m-r\,r-1} & O_{m-r\,n-r} \end{bmatrix}.
\]

Take

\[
C'' = \begin{bmatrix} 1 & O_{1\,m-1} \\ O_{m-1\,1} & C' \end{bmatrix},
\]

and

\[
D'' = \begin{bmatrix} 1 & O_{1\,n-1} \\ O_{n-1\,1} & D' \end{bmatrix}.
\]

Then $C''$ and $D''$ are nonsingular. In fact,

\[
(C'')^{-1} = \begin{bmatrix} 1 & O_{1\,m-1} \\ O_{m-1\,1} & (C')^{-1} \end{bmatrix}
\]




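(use block multiplication to show this). Again, using block multiplication, we find
that

\[
C'' \cdot \begin{bmatrix} 1 & O_{1\,n-1} \\ O_{m-1\,1} & B \end{bmatrix} \cdot D'' = \begin{bmatrix} 1 & O_{1\,n-1} \\ O_{m-1\,1} & C'BD' \end{bmatrix} = \begin{bmatrix} I_r & O_{r\,n-r} \\ O_{m-r\,r} & O_{m-r\,n-r} \end{bmatrix}.
\]

Take $P = C'' \cdot C$ and $Q = D \cdot D''$. Then $P$ is a nonsingular $m \times m$ matrix, and $Q$ a
nonsingular $n \times n$ matrix such that

\[
PAQ = \begin{bmatrix} I_r & O_{r\,n-r} \\ O_{m-r\,r} & O_{m-r\,n-r} \end{bmatrix}
\]

is in normal form. Finally,

\[
\begin{bmatrix} I_r & O_{r\,n-r} \\ O_{m-r\,r} & O_{m-r\,n-r} \end{bmatrix}
\]

is equivalent to

\[
\begin{bmatrix} I_s & O_{s\,n-s} \\ O_{m-s\,s} & O_{m-s\,n-s} \end{bmatrix}
\]

if and only if $r = s$, for one can be obtained from the other using elementary operations
if and only if $r = s$. ♯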
Corollary 2.6.4 There are $\min(m, n) + 1$ equivalence classes of equivalent matrices
in $M_{mn}(F)$.


Proof There are $\min(m, n) + 1$ matrices in $M_{mn}(F)$ which are in normal form.


Corollary 2.6.5 Two $m \times n$ matrices $A$ and $B$ are equivalent if and only if they have the same
rank.


Proof Since the rank of a matrix does not change under elementary operations, and the
rank of the matrix

\[
\begin{bmatrix} I_r & O_{r\,n-r} \\ O_{m-r\,r} & O_{m-r\,n-r} \end{bmatrix}
\]

is $r$, the result follows. ♯


Corollary 2.6.6 All nonsingular matrices in $M_n(F)$ are equivalent to $I_n$. The group
$GL(n, F)$ is a single complete equivalence class of equivalent matrices.


Proof Let $A$ be an $n \times n$ matrix which is nonsingular. Then there are nonsingular
matrices $P$ and $Q$ such that $PAQ$ is in normal form. Clearly, then $PAQ$ is also nonsingular.
The result follows if we observe that a matrix in normal form is nonsingular
if and only if it is the identity matrix. ♯


Corollary 2.6.7 The group GL(n, F) is generated by elementary matrices. Indeed, 
every element of GL(n, F) is product of elementary matrices. 




Proof All elementary matrices are nonsingular, and so they belong to $GL(n, F)$.
Further, given any matrix $A \in GL(n, F)$, there are nonsingular matrices $P$ and $Q$
which are products of elementary matrices such that $PAQ = I_n$. But, then $A =
P^{-1}Q^{-1}$. Since the inverse of an elementary matrix is an elementary matrix, $P^{-1}$ and
$Q^{-1}$ are products of elementary matrices. This shows that $A$ is a product of elementary
matrices. ♯


Remark 2.6.8 The matrices $\{E_{ij}^{\lambda} \mid i \neq j, \lambda \in F^{*}\}$ do not generate $GL(n, F)$
(verify).


Remark 2.6.9 The proof of the Theorem 2.6.3 gives us a method by which 


(i) we can reduce a matrix A into normal form, 
(ii) we can find nonsingular matrices P and Q such that PAQ is in normal form, and 
(iii) we can determine whether A is nonsingular, and then we can find its inverse 
also. 


The following two examples illustrate the algorithm.


Example 2.6.10 Let $A$ be an $m \times n$ matrix. To find nonsingular matrices $P$ and $Q$
such that $PAQ$ is in normal form, we proceed as follows: We start with a row with
three columns, the first column $I_m$, the second $A$, and the third column $I_n$. Then
we try to reduce the matrix $A$ into normal form by successive elementary row and
elementary column operations. Whenever we perform a row operation on $A$, we apply
the same operation to the matrix in the first column, and keep the matrix in the third
column as it is, and if we perform a column operation on $A$, then we perform the
same operation on the matrix in the third column, and keep the matrix in the first
column as it is. Then as the matrix $A$ reduces to a matrix in normal form, the matrix in
the first column reduces to the required matrix $P$, and the matrix in the third column
reduces to the required matrix $Q$. Consider, for example, the matrix

\[
A = \begin{bmatrix} 1 & 1 & 1 \\ 2 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 2 \end{bmatrix}.
\]

Let $R_i$ denote the $i$th row, and $C_j$ denote the $j$th column. We start with a row

\[
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 1 & 1 \\ 2 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 2 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
\]

Replacing $R_2$ by $R_2 - 2R_1$ and $R_3$ by $R_3 - R_1$, we transform the above row to the
row




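\[
\begin{bmatrix} 1 & 0 & 0 & 0 \\ -2 & 1 & 0 & 0 \\ -1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 1 & 1 \\ 0 & -2 & -1 \\ 0 & 0 & -1 \\ 0 & 1 & 2 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
\]

Next, replacing $C_2$ by $C_2 - C_1$, and $C_3$ by $C_3 - C_1$, we get the transformed row

\[
\begin{bmatrix} 1 & 0 & 0 & 0 \\ -2 & 1 & 0 & 0 \\ -1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 0 & 0 \\ 0 & -2 & -1 \\ 0 & 0 & -1 \\ 0 & 1 & 2 \end{bmatrix}, \quad
\begin{bmatrix} 1 & -1 & -1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
\]

Interchanging $R_2$ and $R_4$, and then replacing $R_4$ by $R_4 + 2R_2$, it reduces to

\[
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ -1 & 0 & 1 & 0 \\ -2 & 1 & 0 & 2 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 2 \\ 0 & 0 & -1 \\ 0 & 0 & 3 \end{bmatrix}, \quad
\begin{bmatrix} 1 & -1 & -1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
\]

Replacing $C_3$ by $C_3 - 2C_2$, we transform it to

\[
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ -1 & 0 & 1 & 0 \\ -2 & 1 & 0 & 2 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \\ 0 & 0 & 3 \end{bmatrix}, \quad
\begin{bmatrix} 1 & -1 & 1 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{bmatrix}.
\]

Finally, replacing $R_3$ by $-R_3$, and then $R_4$ by $R_4 - 3R_3$, we get

\[
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & -1 & 0 \\ -5 & 1 & 3 & 2 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 1 & -1 & 1 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{bmatrix}.
\]

Thus, $A$ reduces to the normal form

\[
\begin{bmatrix} I_3 \\ O_{1\,3} \end{bmatrix}.
\]

Further, the required nonsingular matrices $P$ and $Q$ are given by

\[
P = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & -1 & 0 \\ -5 & 1 & 3 & 2 \end{bmatrix}
\]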




and

\[
Q = \begin{bmatrix} 1 & -1 & 1 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{bmatrix}.
\]
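The bookkeeping of this example (row operations mirrored on the left matrix, column operations mirrored on the right matrix) can be sketched in Python. This is a minimal illustration, not the author's procedure step for step: the function name `normal_form` is hypothetical, entries are exact rationals, and the pivot is chosen as the first nonzero entry found, so the resulting `P` and `Q` may differ from the ones obtained by hand while still satisfying $PAQ = N$.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def normal_form(A):
    """Return (P, N, Q, r) with P A Q = N in normal form of rank r."""
    m, n = len(A), len(A[0])
    N = [[Fraction(x) for x in row] for row in A]
    P = [[Fraction(int(i == j)) for j in range(m)] for i in range(m)]
    Q = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    r = 0
    while r < min(m, n):
        pos = next(((i, j) for i in range(r, m) for j in range(r, n)
                    if N[i][j] != 0), None)
        if pos is None:
            break                       # remaining block is zero
        i, j = pos
        N[r], N[i] = N[i], N[r]         # row interchange (mirrored on P)
        P[r], P[i] = P[i], P[r]
        for row in N:                   # column interchange (mirrored on Q)
            row[r], row[j] = row[j], row[r]
        for row in Q:
            row[r], row[j] = row[j], row[r]
        c = N[r][r]                     # make the pivot equal to 1
        N[r] = [x / c for x in N[r]]
        P[r] = [x / c for x in P[r]]
        for i2 in range(m):             # clear column r by row operations
            if i2 != r and N[i2][r] != 0:
                f = N[i2][r]
                N[i2] = [a - f * b for a, b in zip(N[i2], N[r])]
                P[i2] = [a - f * b for a, b in zip(P[i2], P[r])]
        for j2 in range(n):             # clear row r by column operations
            if j2 != r and N[r][j2] != 0:
                f = N[r][j2]
                for row in N:
                    row[j2] -= f * row[r]
                for row in Q:
                    row[j2] -= f * row[r]
        r += 1
    return P, N, Q, r
```

The returned `r` is the rank of the matrix, in line with Corollary 2.6.5.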


2.7 Congruent Reduction of Symmetric Matrices 


Definition 2.7.1 A square matrix $A$ is said to be congruent to a matrix $B$ if there is
an invertible matrix $P$ such that $PAP^t = B$.


Observe that if $A$ is symmetric, then $PAP^t$ is also symmetric.


Theorem 2.7.2 Every symmetric matrix A with entries in a field F of characteristic 
different from 2 is congruent to a diagonal matrix. 


Proof The proof is algorithmic. Let us recall that applying an elementary row operation
on $A$ is equivalent to multiplying $A$ from the left by the corresponding elementary matrix
$E$, and applying the same type of elementary column operation on $A$ is equivalent to
multiplying the matrix $A$ from the right by the elementary matrix $E^t$ (note that if we apply an
elementary row operation on the identity matrix and take its transpose, then it is the
same as applying the same elementary column operation on the identity matrix). Thus,
it is sufficient to show that a symmetric matrix with entries in a field $F$ of characteristic
different from 2 can be reduced to a diagonal matrix by applying successively
elementary row operations, each followed by the same type of elementary column operation. Let $A$
be a symmetric matrix with entries in $F$, where the characteristic of $F$ is different from
2. If $A = 0$, then there is nothing to do. Suppose that $A \neq 0$. If the first row (and so also
the first column) of $A$ is zero, then we pass on to the second row. Otherwise, we may suppose that
$a_{11} \neq 0$. For, if some diagonal entry $a_{ii}$ is nonzero, then we interchange the first row and
the $i$th row, and then the first column and the $i$th column. If all the diagonal entries are 0,
suppose that $a_{1i} = a_{i1} \neq 0$; then adding the $i$th row to the first
row, and then adding the $i$th column to the first column, the first row first column entry
becomes $2a_{1i} \neq 0$ (note that the characteristic of $F$ is not 2). Then, for each $i \neq 1$, adding
$-a_{i1}a_{11}^{-1}$ times the first row to the $i$th row, and $-a_{1i}a_{11}^{-1}$ times the first column to the
$i$th column, we reduce the matrix to a symmetric matrix in which all entries in
the first row (and so also in the first column) except $a_{11}$ are 0. Now, if $a_{ij} = 0$ for all
$i, j \geq 2$ with $i \neq j$, we have reduced it to a diagonal matrix. If not, using the previous argument,
we may take $a_{22} \neq 0$, and then for $i \neq 2$ reduce all the entries $a_{i2} = a_{2i}$ to 0.
Proceeding inductively we reduce the matrix $A$ to a diagonal matrix. ♯


Taking $Q = P^{-1}$, we get the following corollary.


Corollary 2.7.3 Every symmetric matrix $A$ with entries in a field of characteristic
different from 2 can be decomposed as $A = QDQ^t$, where $Q$ is an invertible matrix,
and $D$ is a diagonal matrix. ♯


Remark 2.7.4 The theorem does not hold over a field of characteristic 2. Consider
the matrix

\[
\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}.
\]




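Suppose that

\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} a & c \\ b & d \end{bmatrix} = \begin{bmatrix} p & 0 \\ 0 & q \end{bmatrix}.
\]

Equating the corresponding entries, $p = ba + ab$, $q = dc + cd$, and $da + cb = 0 =
bc + ad$. Since the field is of characteristic 2, $p = 0 = q$. In turn,

\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} a & c \\ b & d \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}.
\]

But, then

\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}
\]

is singular.


We illustrate the algorithm of congruent reduction by means of an example.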


Example 2.7.5 Let $A$ be a symmetric $n \times n$ matrix. To find a nonsingular matrix $P$
such that $PAP^t$ is a diagonal matrix, we proceed as follows: We start with a row
with 3 columns, the first column $I_n$, the second column $A$, and the third column $I_n$.
We reduce the matrix $A$ into a diagonal form by successive elementary row and
corresponding elementary column operations as described in the above theorem.
Whenever we apply an elementary row operation on $A$, we apply the same operation
on the matrix in the first column, and keep the matrix in the third column as it is, and
whenever we apply an elementary column operation we apply the same operation on
the matrix in the third column, and keep the first column as it is. In this process as
soon as $A$ reduces to a diagonal matrix, the first column reduces to $P$, and the third
column then will be $P^t$. Further, $PAP^t$ is a diagonal matrix. Consider, for example,
the matrix

\[
A = \begin{bmatrix} 0 & 1 & 2 \\ 1 & 0 & 1 \\ 2 & 1 & 0 \end{bmatrix}
\]

and the triple

\[
(I_3, A, I_3).
\]

If we apply the following elementary operations

1. $R_1 \to R_1 + R_2$,

2. $C_1 \to C_1 + C_2$,

3. $R_2 \to R_2 - \frac{1}{2}R_1$,

4. $C_2 \to C_2 - \frac{1}{2}C_1$,

5. $R_3 \to R_3 - \frac{3}{2}R_1$,

6. $C_3 \to C_3 - \frac{3}{2}C_1$,




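7. $R_3 \to R_3 - R_2$,

8. $C_3 \to C_3 - C_2$,

successively, on the triple $(I_3, A, I_3)$, then the triple of matrices reduces to the triple

\[
\begin{bmatrix} 1 & 1 & 0 \\ -\frac{1}{2} & \frac{1}{2} & 0 \\ -1 & -2 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 2 & 0 & 0 \\ 0 & -\frac{1}{2} & 0 \\ 0 & 0 & -4 \end{bmatrix}, \quad
\begin{bmatrix} 1 & -\frac{1}{2} & -1 \\ 1 & \frac{1}{2} & -2 \\ 0 & 0 & 1 \end{bmatrix}.
\]

Further, take $L = P^{-1}$ and $D = \mathrm{diag}(2, -\frac{1}{2}, -4)$, then $A = LDL^t$. Note that $L$ is
not a lower triangular matrix. However, if we consider the matrix

\[
A = \begin{bmatrix} 1 & 1 & 2 \\ 1 & 0 & 1 \\ 2 & 1 & 0 \end{bmatrix}
\]

with the triple

\[
(I_3, A, I_3)
\]

of matrices, and apply the following elementary operations to the
triple to reduce $A$ to a diagonal matrix,

1. $R_2 \to R_2 - R_1$ and $R_3 \to R_3 - 2R_1$,

2. $C_2 \to C_2 - C_1$ and $C_3 \to C_3 - 2C_1$,

3. $R_3 \to R_3 - R_2$,

4. $C_3 \to C_3 - C_2$,

then the triple of matrices reduces to the triple

\[
\begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ -1 & -1 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -3 \end{bmatrix}, \quad
\begin{bmatrix} 1 & -1 & -1 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{bmatrix}.
\]

Thus, $A$ is congruent to $\mathrm{diag}(1, -1, -3)$ and $P$ is the matrix

\[
\begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ -1 & -1 & 1 \end{bmatrix}.
\]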



Further, take $L = P^{-1}$ and $D = \mathrm{diag}(1, -1, -3)$, then $A = LDL^t$. Note that in
this case $P$ and $L$ are lower triangular matrices.
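The congruent reduction carried out in this example can be sketched in Python. This is a minimal illustration under assumptions: the name `congruent_diagonalize` is hypothetical, the entries are rationals (so the characteristic is 0), and each row operation is immediately followed by the matching column operation, as in the proof of Theorem 2.7.2.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def congruent_diagonalize(A):
    """For a symmetric rational matrix A return (P, D), D = P A P^t diagonal."""
    n = len(A)
    D = [[Fraction(x) for x in row] for row in A]
    P = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

    def add_multiple(i, j, f):
        # R_i -> R_i + f R_j on D and P, then C_i -> C_i + f C_j on D
        D[i] = [a + f * b for a, b in zip(D[i], D[j])]
        P[i] = [a + f * b for a, b in zip(P[i], P[j])]
        for row in D:
            row[i] += f * row[j]

    def interchange(i, j):
        D[i], D[j] = D[j], D[i]
        P[i], P[j] = P[j], P[i]
        for row in D:
            row[i], row[j] = row[j], row[i]

    for k in range(n):
        if D[k][k] == 0:
            d = next((i for i in range(k + 1, n) if D[i][i] != 0), None)
            if d is not None:
                interchange(k, d)       # bring a nonzero diagonal entry up
            else:
                pos = next(((i, j) for i in range(k, n)
                            for j in range(i + 1, n) if D[i][j] != 0), None)
                if pos is None:
                    break               # remaining block is zero
                i, j = pos
                add_multiple(i, j, 1)   # now D[i][i] = 2 D[i][j] != 0
                if i != k:
                    interchange(k, i)
        for i in range(k + 1, n):
            if D[i][k] != 0:
                add_multiple(i, k, -D[i][k] / D[k][k])
    return P, D
```

Applied to the two matrices of the example, this sketch performs the same operations as listed there, so it returns the same $P$ and the same diagonal matrices.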


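Example 2.7.6 Consider the symmetric matrix

\[
A = \begin{bmatrix} 3 & 0 & -1 \\ 0 & 1 & 0 \\ -1 & 0 & 3 \end{bmatrix}
\]

with the triple

\[
(I_3, A, I_3)
\]

of matrices, and apply the following elementary operations to the
triple to reduce $A$ to a diagonal matrix,

1. $R_3 \to R_3 + \frac{1}{3}R_1$, and

2. $C_3 \to C_3 + \frac{1}{3}C_1$.

Then the triple of matrices reduces to the triple

\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ \frac{1}{3} & 0 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 3 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \frac{8}{3} \end{bmatrix}, \quad
\begin{bmatrix} 1 & 0 & \frac{1}{3} \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
\]

Here again, $P$ is a lower triangular matrix and the diagonal matrix $D$ has all diagonal
entries positive. As such, if we take $L = P^{-1}\sqrt{D}$, where $\sqrt{D} = \mathrm{Diag}\left(\sqrt{3}, 1, \sqrt{\frac{8}{3}}\right)$,
then $A = LL^t$. Later we shall describe those symmetric matrices which can be
expressed as $LL^t$, where $L$ is a lower triangular matrix.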
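As a numerical cross-check of this example, here is a minimal Python sketch of the standard Cholesky recursion. It is not taken from the text: it computes $L$ directly rather than through $P^{-1}\sqrt{D}$, it works in floating point over the real numbers, and the function name `cholesky` is an illustrative choice.

```python
import math

def cholesky(A):
    """Return a lower triangular L with positive diagonal and A = L L^t,
    or None if the recursion meets a non-positive pivot."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = A[i][i] - s
                if d <= 0:
                    return None          # no such real L with positive diagonal
                L[i][i] = math.sqrt(d)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L
```

For the matrix of the example the diagonal of the computed $L$ is $\sqrt{3}$, $1$, $\sqrt{8/3}$ up to rounding, in agreement with $L = P^{-1}\sqrt{D}$.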


Exercises 


2.7.1 Give two bases of the vector space $M_{nm}(F)$ of $n \times m$ matrices with entries in
a field $F$ over the field $F$.


2.7.2 Find a basis, and so also the dimension of the vector space $S_n(F)$ of $n \times n$
symmetric matrices with entries in a field $F$.


2.7.3 Let $F$ be a field of characteristic different from 2. Find a basis, and so also
the dimension of the vector space $SS_n(F)$ of $n \times n$ skew symmetric matrices with
entries in a field $F$. Do the same for fields of characteristic 2. Are they the same?


2.7.4 Let $A$ be an $n \times m$ matrix. Consider the subset $W = \{B \in M_{mp} \mid AB = O_{n\,p}\}$
of $M_{mp}$. Show that $W$ is a subspace of $M_{mp}$. Further, show that the dimension of $W$
is $p\,n(A)$, where $n(A)$ denotes the nullity of $A$.


2.7.5 Show that every square matrix $A$ with entries in a field $F$ of characteristic
different from 2 is uniquely expressible as a sum of a symmetric matrix and a skew
symmetric matrix. Deduce that the vector space $M_n(F)$ is the direct sum $S_n(F) \oplus SS_n(F)$.


Hint: $A = \frac{A + A^t}{2} + \frac{A - A^t}{2}$.




2.7.6 Find a basis, and so also the dimension of the vector space $UT_n(F)$ of upper
triangular matrices over $F$.


2.7.7 The sum of the diagonal entries of a square matrix $A$ is called the trace of
$A$, and it is denoted by $Tr(A)$. Let $sl(n, F)$ denote the set of $n \times n$ matrices with
trace 0. Show that $sl(n, F)$ is a vector space with respect to the addition of matrices and
multiplication by scalars. Find a basis of $sl(n, F)$, and so also its dimension.


2.7.8 Let $A$ and $B$ be square $n \times n$ matrices. Show that $Tr(AB - BA) = 0$. Deduce
that $AB - BA$ is never the identity matrix. Show by means of an example that it may be
a nonsingular diagonal matrix.


2.7.9 Show by means of an example that $AA^t$ need not be the same as $A^tA$.


2.7.10 Consider the co-diagonal $n \times n$ matrix $\Gamma_n = [a_{ij}]$, where $a_{ij} = 1$ if $i + j =
n + 1$, and $a_{ij} = 0$ otherwise. Show that $\Gamma_n$ is symmetric and $\Gamma_n^2 = I_n$. What is
the matrix $\Gamma_n A \Gamma_n$?


2.7.11 Describe all $2 \times 2$ matrices $A$ such that $A^2 = O_2$.

2.7.12 Let $A$ be a strictly upper (lower) triangular $n \times n$ matrix. Show that $A^n = O_n$.


2.7.13 Let $A$ be a square $n \times n$ matrix which is nilpotent in the sense that $A^m = O_n$
for some $m$. Show that $I_n + A$ is invertible. Show that

\[
I_n + A + A^2 + \cdots + A^{m-1}
\]

is the inverse of $I_n - A$. Is the converse of this statement true? Support.


2.7.14 Let $A = [a_{ij}]$ be a square $n \times n$ matrix which commutes with $e_{12}$. Show
that $a_{i1} = 0 = a_{2j}$ for $i \neq 1$, $j \neq 2$, and $a_{11} = a_{22}$. Show that a matrix commutes with all $e_{ij}$ if
and only if it is a scalar matrix. Show also that the matrices which commute with all
transvections are precisely the scalar matrices. Deduce that the center $Z(GL(n, F))$ is
precisely $\{aI_n \mid a \in F^{*}\}$.


2.7.15 Find a basis, and so also the dimension of the subspaces of $\mathbb{R}^4$ generated by
the following subsets:


Gq) {(1, 0, 2, 1), @, 1,3, 2), (7,4, 9,5), (1,5, 6, D}, 


(ii) $\{(1, 1, 1, 1), (1, 0, 2, 3), (1, 0, 4, 9), (1, 0, 8, 27)\}$.


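2.7.16 Reduce the following matrices into reduced row echelon form. Find
bases of their row spaces, column spaces, and null spaces. Find their ranks and
nullities. Further, for each of the matrices $A$, find an invertible matrix $P$ such that $PA$
is a reduced row echelon form of $A$.

\[
\begin{bmatrix} 0 & 0 & 3 & -3 & -3 \\ 2 & 4 & 3 & 3 & 1 \\ 2 & 4 & 3 & 3 & 3 \\ 1 & 2 & 2 & 1 & 2 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \\ 1 & 0 & -1 & 0 \\ -5 & 1 & 3 & 2 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 7 & 11 \\ 3 & 7 & 14 & 25 \\ 4 & 11 & 25 & 50 \end{bmatrix},
\]

\[
\begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 9 & 10 & 11 & 12 \\ 13 & 14 & 15 & 16 \end{bmatrix}.
\]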

2.7.17 Check if the following systems of linear equations are consistent, and if so 
find their general solutions. 


1. 
xy + 3x. + 43 = 1. 


2x, — y+ = 
4x) + x2 — x3 


2. 
0. 
8x) — 3x. + %3 3. 


Xy + 2x. + x3 + 2x4 + x5 = 2. 
2x, + 4x0 + 3x3 + 3x4 + 5 
2x, + 4x. + 4x3 + 2x4 + 2X5 
Xy + 2x. + 2x3 + x4 + 2x5 = 2. 


= 8: 
= 8. 


Ax, 15x2 2x3 32x, = —40. 
Ay = 2x2 a 3x4 = —4., 
—3x,; + 16x. + 3x3 + 38x4 = 46. 
xy 6x2 = Xo 14x4 = -17. 


2.7.18 Find the value of a, if possible, for which the following system of linear 
equations is consistent. 


4x, 15x. 2x3 32Xx4 = —40. 


PS io 2x9 = 3x4 = —4, 
—3x, + 16x. + 3x3 + 38x4 = 46. 
xy 6x2 X32 14x4 = a. 


2.7.19 Check if the matrices in Exercise 2.7.16 have LU decompositions, and if so find
their LU decompositions.


2.7.20 Express each of the following symmetric matrices as $PDP^t$, where $P$ is a
nonsingular matrix, and $D$ a diagonal matrix. Which of the matrices are expressible
as $LDL^t$, where $L$ is a lower triangular matrix? Also express them, if possible, as $LL^t$,
where $L$ is a lower triangular matrix.




101 ee: 012 
1123 

eed 1213]? 

113 1132 210 


\[
\begin{bmatrix} 1 & 2 & 3 & 4 \\ 2 & 6 & 7 & 8 \\ 3 & 7 & 11 & 12 \\ 4 & 8 & 12 & 16 \end{bmatrix}
\]


2.7.21 Find the maximum number of arithmetic operations needed to reduce a $3 \times 3$
matrix into reduced row echelon form. Generalize it to $n \times n$ matrices.


2.7.22 Write a program in C-Language to check if a system of linear equations is 
consistent, and if so to find a general solution. 


2.7.23 Write a program in C-Language to check if a matrix A admits LU decom- 
position, and if so to find it. 


2.7.24 Write a program in C-Language to check if a symmetric matrix $A$ admits an $LL^t$
decomposition, and if so to find it.


Chapter 3 
Linear Transformations 


This chapter centers around the study of linear transformations and their matrix 
representations. 


3.1 Definition and Examples 


Definition 3.1.1 Let $V_1$ and $V_2$ be vector spaces over a field $F$. A map $T$ from $V_1$
to $V_2$ is called a linear transformation or a homomorphism if

\[
T(ax + by) = aT(x) + bT(y)
\]

for all $a, b \in F$, and $x, y \in V_1$.
A bijective linear transformation is called an isomorphism.


Proposition 3.1.2 Let $T$ be a linear transformation from a vector space $V_1$ to a
vector space $V_2$. Then the following hold:

(i) $T(0) = 0$.
(ii) $T(-x) = -T(x)$ for all $x \in V_1$.
(iii) If $W_1$ is a subspace of $V_1$, then $T(W_1)$ is a subspace of $V_2$.
(iv) If $W_2$ is a subspace of $V_2$, then the inverse image $T^{-1}(W_2)$ of $W_2$ under $T$ is a
subspace of $V_1$.

Proof (i) Since $T$ is a linear transformation, $T(0) = T(0 \cdot x + 0 \cdot y) =
0 \cdot T(x) + 0 \cdot T(y) = 0$.
(ii) $T(-x) = T(-1 \cdot x + 0 \cdot x) = (-1) \cdot T(x) + 0 \cdot T(x) = -T(x)$.

(iii) Let $W_1$ be a subspace of $V_1$. Then $0 \in W_1$, and so $0 = T(0) \in T(W_1)$. Let
$T(x), T(y) \in T(W_1)$, where $x, y \in W_1$. Since $W_1$ is a subspace,


© Springer Nature Singapore Pte Ltd. 2017
R. Lal, Algebra 2, Infosys Science Foundation Series in Mathematical Sciences,
DOI 10.1007/978-981-10-4256-0_3




$ax + by \in W_1$ for all $a, b \in F$. It follows that $aT(x) + bT(y) = T(ax + by)$
belongs to $T(W_1)$. This shows that $T(W_1)$ is a subspace of $V_2$.

(iv) Since $0 = T(0) \in W_2$, it follows that $0 \in T^{-1}(W_2)$. Let $x, y \in T^{-1}(W_2)$. Then
$T(x), T(y) \in W_2$. Since $W_2$ is a subspace, $T(ax + by) = aT(x) + bT(y) \in
W_2$ for all $a, b \in F$. It follows that $ax + by \in T^{-1}(W_2)$. This shows that $T^{-1}(W_2)$
is a subspace of $V_1$. ♯


The inverse image $T^{-1}(\{0\})$ of the trivial subspace $\{0\}$ under a linear transformation
$T$ is a subspace, called the null space of $T$, and it is denoted by $N(T)$. $N(T)$ is
also called the kernel of $T$, and then it is denoted by $\ker T$.


Proposition 3.1.3 A linear transformation T is injective if and only if N(T) = {0}. 


Proof Suppose that $T$ is injective, and $x \in N(T)$. Then $T(x) = 0 = T(0)$. Since
$T$ is assumed to be injective, $x = 0$. Hence $N(T) = \{0\}$. Suppose, conversely,
that $N(T) = \{0\}$. Suppose that $T(x) = T(y)$. Since $T$ is a linear transformation,
$T(x - y) = T(x) - T(y) = 0$. Hence $x - y \in N(T) = \{0\}$, and so $x = y$.
Thus, $T$ is injective. ♯

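Example 3.1.4 Let $\overline{a} = (a_1, a_2, a_3)$ be a vector in the Euclidean vector space $\mathbb{R}^3$
over $\mathbb{R}$. Define a map $T$ from $\mathbb{R}^3$ to itself by $T(\overline{x}) = \overline{x} \times \overline{a}$, where $\times$ is the vector
product in $\mathbb{R}^3$. Then $T$ is a linear transformation (follows from the property of vector
product). The null space of $T$ is given by $N(T) = \{\overline{x} \mid \overline{x} \times \overline{a} = \overline{0}\} = \{\alpha\overline{a} \mid \alpha \in \mathbb{R}\}$
provided that $\overline{a} \neq \overline{0}$. What is the image of $T$?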


Example 3.1.5 Let $F$ be a field. Let $V = F^n$ and $W = F^m$ be standard
vector spaces over the field $F$. Let $T$ be a linear transformation from $F^n$ to
$F^m$. Let $\{\overline{e_1}, \overline{e_2}, \ldots, \overline{e_n}\}$ denote the standard basis of $F^n$. Let $T(\overline{e_i}) = \overline{r_i} =
(a_{i1}, a_{i2}, \ldots, a_{im})$. Then $T$ determines a matrix $A = M(T)$ whose $i$th row is $\overline{r_i}$. It
is evident that $T(\overline{x}) = \overline{x}M(T)$, where $\overline{x}$ is treated as a $1 \times n$ matrix. Thus, a linear
transformation from $F^n$ to $F^m$ is precisely multiplication by an $n \times m$ matrix from the
right (note that elements of $F^n$ are considered as $1 \times n$ matrices). The null space
$N(T)$ is precisely the null space $N(M(T))$ of the corresponding matrix $M(T)$.


Example 3.1.6 Let $V$ be a vector space over a field $F$, and $W$ a subspace of $V$.
Consider the quotient space $V/W$. The quotient map $\nu$ from $V$ to $V/W$ given by
$\nu(x) = x + W$ is a linear transformation (follows from the definition of operations
on $V/W$). Clearly, $\nu$ is surjective, and the null space $N(\nu)$ is given by $N(\nu) = \{x \in
V \mid x + W = \nu(x) = W\}$ (note that the zero of $V/W$ is the coset $W$). Since
$x + W = W$ if and only if $x \in W$, it follows that $N(\nu) = W$. In turn, it also
follows that every subspace of a vector space is the null space of a linear transformation.


Example 3.1.7 Let $\mathcal{P}_n$ denote the vector space of polynomials over the field $\mathbb{R}$ of
real numbers of degree at most $n$. Let $D$ denote the derivative. Thus,

\[
D(a_0 + a_1x + a_2x^2 + \cdots + a_nx^n) = a_1 + 2a_2x + 3a_3x^2 + \cdots + na_nx^{n-1}.
\]

Then $D$ is a linear transformation (verify). The null space of $D$ is the space of constant
polynomials. Find its rank and nullity. Note that $D$ is nilpotent. Indeed, $D^{n+1}$ is the




zero linear transformation. Further, $I + D$ is an isomorphism. In fact, $I - D + D^2 -
D^3 + \cdots + (-1)^nD^n$ is the inverse of $I + D$ (verify).
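The claims about $D$ can be checked on matrices. The following is a minimal Python sketch under assumptions not made in the text: a polynomial $a_0 + a_1x + \cdots + a_nx^n$ is identified with the row vector $(a_0, \ldots, a_n)$, $D$ is represented by the matrix whose $i$th row holds the coefficients of $D(x^i)$ (so that $D$ acts by multiplication from the right), and the helper names are illustrative.

```python
def identity(size):
    return [[int(i == j) for j in range(size)] for i in range(size)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def derivative_matrix(n):
    """Matrix of D on polynomials of degree <= n, basis 1, x, ..., x^n.

    Row i (0-based) holds the coefficients of D(x^i) = i x^(i-1).
    """
    size = n + 1
    M = [[0] * size for _ in range(size)]
    for i in range(1, size):
        M[i][i - 1] = i
    return M

def alternating_sum(n):
    """I - D + D^2 - ... + (-1)^n D^n as a matrix."""
    D = derivative_matrix(n)
    total = identity(n + 1)
    power = identity(n + 1)
    for k in range(1, n + 1):
        power = matmul(power, D)
        sign = -1 if k % 2 else 1
        total = [[t + sign * p for t, p in zip(trow, prow)]
                 for trow, prow in zip(total, power)]
    return total
```

For example, with $n = 3$ the row vector $(1, 1, 1, 1)$ of $1 + x + x^2 + x^3$ multiplied by `derivative_matrix(3)` gives $(1, 2, 3, 0)$, the coefficients of $1 + 2x + 3x^2$.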


Example 3.1.8 Let $C^{\infty}(\mathbb{R})$ denote the vector space of real-valued functions on $\mathbb{R}$
which are $r$-times continuously differentiable for all $r$. The differential
operator $D^2 - 3D + 2$ from $C^{\infty}(\mathbb{R})$ to itself given by

\[
(D^2 - 3D + 2)(f)(x) = \frac{d^2f(x)}{dx^2} - 3\frac{df(x)}{dx} + 2f(x)
\]

is a linear transformation. The null space of this differential operator is precisely
$\{\alpha e^x + \beta e^{2x} \mid \alpha, \beta \in \mathbb{R}\}$, which is of dimension 2.
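As a quick check that every function of the stated form lies in the null space (the reverse inclusion is not shown here), one may apply the operator term by term:

```latex
\begin{align*}
(D^2 - 3D + 2)(\alpha e^{x} + \beta e^{2x})
  &= \alpha\,(e^{x} - 3e^{x} + 2e^{x}) + \beta\,(4e^{2x} - 6e^{2x} + 2e^{2x}) \\
  &= \alpha \cdot 0 + \beta \cdot 0 = 0 .
\end{align*}
```

The exponents 1 and 2 are the roots of $t^2 - 3t + 2 = (t - 1)(t - 2)$.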


3.2 Isomorphism Theorems 


Theorem 3.2.1 (Fundamental Theorem of Homomorphism). Let $T$ be a linear
transformation from a vector space $V$ over a field $F$ to a vector space $V'$ over
the same field $F$. Let $W$ be a subspace of $V$. Then there exists a linear transformation
$\overline{T}$ from $V/W$ to $V'$ such that $\overline{T} \circ \nu = T$ if and only if $W \subseteq N(T)$. Also if such
a linear transformation exists, it is unique. Further, then $\overline{T}$ will be injective if and
only if $W = N(T)$. Finally, $\overline{T}$ is an isomorphism if and only if $T$ is surjective and
$W = N(T)$.


Proof Suppose that there is a linear transformation $\overline{T}$ from $V/W$ to $V'$ such that
$\overline{T} \circ \nu = T$. Let $x \in W$. Then $x + W = W$, the zero of $V/W$. Now,

\[
T(x) = \overline{T}(\nu(x)) = \overline{T}(x + W) = \overline{T}(W) = 0,
\]

for $\overline{T}$ is a linear transformation. Thus, $x \in N(T)$. This shows that $W \subseteq N(T)$.
Conversely, suppose that $W \subseteq N(T)$ and $x + W = y + W$. Then $x - y \in W$. Since
$W \subseteq N(T)$, $T(x - y) = 0$. Since $T$ is a linear transformation, $T(x) = T(y)$.
Thus, we have a map $\overline{T}$ from $V/W$ to $V'$ defined by $\overline{T}(x + W) = T(x)$. It is easily
observed that $\overline{T}$ is a linear transformation such that $\overline{T} \circ \nu = T$. Further, if $T'$ is
a linear transformation such that $T' \circ \nu = T$, then $T'(x + W) = T'(\nu(x)) =
(T' \circ \nu)(x) = T(x) = \overline{T}(x + W)$. This shows that $T' = \overline{T}$.

Next, suppose that such a $\overline{T}$ exists, and it is injective. Then already $W \subseteq N(T)$.
Let $x \in N(T)$. Then $\overline{T}(x + W) = T(x) = 0 = \overline{T}(W)$. Since $\overline{T}$ is supposed to
be injective, $x + W = W$, and so $x \in W$. This shows that $W = N(T)$.

Conversely, suppose that $N(T) = W$. Suppose that $\overline{T}(x + W) = \overline{T}(y + W)$.
Then $T(x) = T(y)$. Hence $x - y \in N(T) = W$. This means that $x + W = y + W$.
This shows that $\overline{T}$ is injective. Finally, since $\overline{T} \circ \nu = T$ and $\nu$ is surjective, $\overline{T}$ is surjective if and only
if $T$ is surjective. ♯


Corollary 3.2.2 Let $T$ from $V$ to $V'$ be a surjective linear transformation. Then
$V/N(T) \approx V'$. ♯




Theorem 3.2.3 (Noether Isomorphism Theorem). Let $V_1$ and $V_2$ be vector subspaces
of a vector space $V$ over a field $F$. Then $(V_1 + V_2)/V_2$ is isomorphic to
$V_1/(V_1 \cap V_2)$.


Proof Define a map $\eta$ from $V_1$ to $(V_1 + V_2)/V_2$ by $\eta(x) = x + V_2$. Clearly, $\eta$ is a
linear transformation. Any element of $(V_1 + V_2)/V_2$ is of the form $(x + y) + V_2$, where
$x \in V_1$ and $y \in V_2$. But, then $(x + y) + V_2 = x + V_2 = \eta(x)$. This shows that $\eta$
is surjective. Further,

\[
N(\eta) = \{x \in V_1 \mid x + V_2 = V_2\} = \{x \in V_1 \mid x \in V_2\} = V_1 \cap V_2.
\]

The result follows from the fundamental theorem of homomorphism. ♯


Proposition 3.2.4 Let V and V’ be vector spaces over a field F. Then, 


(i) a linear transformation $T$ from $V$ to $V'$ is surjective if and only if it takes a set
of generators to a set of generators.
(ii) a linear transformation $T$ from $V$ to $V'$ is injective if and only if it takes a
linearly independent set to a linearly independent set.
(iii) a linear transformation $T$ from $V$ to $V'$ is an isomorphism if and only if it takes
a basis to a basis.


Proof (i) Let $T$ be a linear transformation and $S$ a subset of $V$. Since $T(\langle S \rangle)$ is a
subspace containing $T(S)$, it follows that $\langle T(S) \rangle \subseteq T(\langle S \rangle)$. Further, since the
image of a linear combination of members of $S$ is a linear combination of members of $T(S)$,
it follows that $T(\langle S \rangle) \subseteq \langle T(S) \rangle$. Thus $\langle T(S) \rangle = T(\langle S \rangle)$. The result follows.

(ii) Let $T$ be an injective linear transformation from $V$ to $V'$. Let $S$ be a linearly
independent subset of $V$. Let $y_1 = T(x_1), y_2 = T(x_2), \ldots, y_r = T(x_r)$ be
distinct elements of $T(S)$. Since $T$ is injective, $x_1, x_2, \ldots, x_r$ are distinct elements
of $S$. Suppose that

$$\alpha_1 T(x_1) + \alpha_2 T(x_2) + \cdots + \alpha_r T(x_r) = 0.$$

Since $T$ is a linear transformation,

$$T(\alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_r x_r) = 0.$$

Since $T$ is injective,

$$\alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_r x_r = 0.$$

Since $S$ is linearly independent, $\alpha_i = 0$ for all $i$. This shows that $T(S)$ is linearly
independent. Conversely, suppose that $T$ takes a linearly independent subset to a
linearly independent subset. Let $x \in V$, $x \neq 0$. Then $\{x\}$ is linearly independent,
and so $\{T(x)\}$ is linearly independent. Thus, $T(x) \neq 0$. This shows that $T$ is injective.

(iii) Follows from (i) and (ii). ♯




Corollary 3.2.5 Let $V$ be a finite-dimensional vector space over a field $F$, and
$S = \{x_1, x_2, \ldots, x_r\}$ an ordered basis of $V$. Then a linear transformation $T$ from $V$
to $W$ is an isomorphism if and only if $\{T(x_1), T(x_2), \ldots, T(x_r)\}$ is an ordered basis
of $W$. ♯


Proposition 3.2.6 Let $V$ and $W$ be vector spaces over a field $F$. Let $S$ be a basis of
$V$. Then any map $f$ from $S$ to $W$ has a unique extension to a linear transformation $T_f$
from $V$ to $W$. More precisely, we have a bijective map $\eta$ from the set $\mathrm{Hom}_F(V, W)$
of all linear transformations from $V$ to $W$ to the set $\mathrm{Map}(S, W)$ of all maps from $S$
to $W$ given by $\eta(T) = T|_S$, the restriction of $T$ to $S$.


Proof Let $f$ be a map from $S$ to $W$. Since $S$ is a basis of $V$, every nonzero element
$x \in V$ has a unique representation as

$$x = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_r x_r,$$

where $x_1, x_2, \ldots, x_r$ are distinct elements of $S$, and all $\alpha_i$ are nonzero. Thus, we have
a map $T$ defined by

$$T(\alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_r x_r) = \alpha_1 f(x_1) + \alpha_2 f(x_2) + \cdots + \alpha_r f(x_r).$$

Clearly, $T$ extends $f$, and it is a linear transformation. If $T'$ is also a linear
transformation which extends $f$, then

$$T'(\alpha_1 x_1 + \cdots + \alpha_r x_r) = \alpha_1 T'(x_1) + \cdots + \alpha_r T'(x_r) = \alpha_1 f(x_1) + \cdots + \alpha_r f(x_r) = T(\alpha_1 x_1 + \cdots + \alpha_r x_r).$$

Hence $T = T'$. ♯


Remark 3.2.7 The above proposition says that two linear transformations are the same
if and only if they agree on a basis.


Corollary 3.2.8 Any two finite-dimensional vector spaces over $F$ are isomorphic if and only
if they are of the same dimension. In particular, any $n$-dimensional vector space over $F$
is isomorphic to the standard vector space $F^n$.


Proof Suppose that $V$ and $W$ are isomorphic, and $T$ is an isomorphism from $V$ to
$W$. Then $T$, by Corollary 3.2.5, takes an ordered basis to an ordered basis. Hence
$\dim V = \dim W$. Conversely, suppose that $\dim V = \dim W$. Then there is a
bijective map from a basis of $V$ to a basis of $W$ which can be extended to a linear
transformation taking a basis to a basis. Thus, this extended linear transformation is
an isomorphism. ♯


Corollary 3.2.9 Let $V$ and $W$ be vector spaces of dimensions $n$ and $m$, respectively,
over a field $F$ containing $q$ elements. Then the number of linear transformations
from $V$ to $W$ is $(q^m)^n$.




Proof Since the dimension of $W$ is $m$, and $F$ contains $q$ elements, $W$ contains $q^m$
elements. A basis of $V$ contains $n$ elements. It follows from the above results that there
are as many linear transformations from $V$ to $W$ as there are maps from a basis of $V$ to
$W$, namely $(q^m)^n$. ♯
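
The count in Corollary 3.2.9 can be checked by brute force in very small cases. The following Python sketch is only an illustration and not part of the argument above: it assumes that $q = p$ is a prime, so that $F$ may be modelled by the integers modulo $p$, and it takes $V = F^n$ and $W = F^m$. The function name `count_linear_maps` is a hypothetical helper introduced here.

```python
from itertools import product


def count_linear_maps(p, n, m):
    """Count the linear maps F_p^n -> F_p^m by testing every set map.

    Assumption: p is prime, so arithmetic modulo p models the field F_p.
    Only feasible for tiny p, n, m, since all |W|^|V| set maps are examined.
    """
    V = list(product(range(p), repeat=n))
    W = list(product(range(p), repeat=m))

    def add(u, v):
        return tuple((a + b) % p for a, b in zip(u, v))

    def scale(c, u):
        return tuple((c * a) % p for a in u)

    count = 0
    # A set map V -> W is a choice of one image in W for each vector of V.
    for images in product(W, repeat=len(V)):
        f = dict(zip(V, images))
        additive = all(f[add(u, v)] == add(f[u], f[v]) for u in V for v in V)
        homogeneous = all(
            f[scale(c, u)] == scale(c, f[u]) for c in range(p) for u in V
        )
        if additive and homogeneous:
            count += 1
    return count
```

For instance, `count_linear_maps(2, 2, 2)` examines all 256 set maps from $F_2^2$ to itself and can be compared with $(2^2)^2$.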


Proposition 3.2.10 Let $V$ and $W$ be finite-dimensional vector spaces of the same
dimension (in particular, $V$ may be the same as $W$). Let $T$ be a linear transformation
from $V$ to $W$. Then,

(i) $T$ is an isomorphism if and only if it is injective.

(ii) $T$ is an isomorphism if and only if it is surjective.


Proof (i) If $T$ is an isomorphism, then it is bijective, and so it is injective also.
Conversely, suppose that $T$ is injective. Let $\{x_1, x_2, \ldots, x_n\}$ be an ordered basis of $V$.
Then it is linearly independent also. By Proposition 3.2.4 (ii), $\{T(x_1), T(x_2), \ldots, T(x_n)\}$
is an ordered linearly independent subset of $W$. Since $n = \dim V = \dim W$,
$\{T(x_1), T(x_2), \ldots, T(x_n)\}$ is a basis of $W$. By Proposition 3.2.4 (iii), it follows
that $T$ is an isomorphism.

(ii) Again, if $T$ is an isomorphism, then it is bijective, and so it is surjective.
Suppose that $T$ is surjective, and $\{x_1, x_2, \ldots, x_n\}$ is an ordered basis of $V$. Since
$T$ is surjective, $\{T(x_1), T(x_2), \ldots, T(x_n)\}$ is an ordered set of generators of $W$. Since
$n = \dim V = \dim W$, $\{T(x_1), T(x_2), \ldots, T(x_n)\}$ is an ordered basis. By
Proposition 3.2.4 (iii), $T$ is an isomorphism. ♯


Let $V$ be a vector space over a field $F$. An isomorphism from $V$ to itself is called
an automorphism of $V$. The set of all automorphisms of $V$ is denoted by $GL(V)$.
Sometimes it is also denoted by $\mathrm{Aut}(V)$. $GL(V)$ is a group with respect to the
composition of maps. This group is called the general linear group on $V$.


Proposition 3.2.11 Let $V$ be a vector space of dimension $n$ over a field $F$, and $B(V)$
the set of all ordered bases of $V$. Let $\{x_1, x_2, \ldots, x_n\}$ be a fixed member of $B(V)$.
Then we have a bijective map $\eta$ from $GL(V)$ to $B(V)$ defined by

$$\eta(T) = \{T(x_1), T(x_2), \ldots, T(x_n)\}.$$


Proof Since an isomorphism takes an ordered basis to an ordered basis, $\eta$ is indeed a
map from $GL(V)$ to $B(V)$. Since a linear transformation is uniquely determined by its
effect on an ordered basis, $\eta$ is injective. Also, given an ordered basis $\{y_1, y_2, \ldots, y_n\}$
of $V$, the map $x_1 \mapsto y_1, x_2 \mapsto y_2, \ldots, x_n \mapsto y_n$ can be extended to an automorphism $T$
of $V$ such that $\eta(T) = \{y_1, y_2, \ldots, y_n\}$. ♯


The following corollary follows from the above proposition and the Proposition 
1.4.19. 


Corollary 3.2.12 Let $V$ be a vector space of dimension $n$ over a field $F$ containing
$q$ elements. Then the group $GL(V)$ is finite of order

$$(q^n - 1)(q^n - q) \cdots (q^n - q^{n-1}).$$ ♯
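
By Proposition 3.2.11, the order of $GL(V)$ equals the number of ordered bases of $V$. The sketch below is a small sanity check and not the text's proof: it assumes $q = p$ is prime and $V = F_p^n$, and it counts ordered bases by picking each new vector outside the span of the vectors already chosen. The names `span`, `count_ordered_bases` and `gl_order` are hypothetical helpers introduced for this illustration.

```python
from itertools import product


def span(vectors, p, n):
    """Set of all F_p-linear combinations of `vectors` inside F_p^n."""
    combos = set()
    for coeffs in product(range(p), repeat=len(vectors)):
        combos.add(tuple(
            sum(c * v[i] for c, v in zip(coeffs, vectors)) % p for i in range(n)
        ))
    return combos


def count_ordered_bases(p, n):
    """Count ordered bases of F_p^n (p prime) by exhaustive search."""
    all_vectors = list(product(range(p), repeat=n))

    def extend(chosen):
        # n linearly independent vectors in an n-dimensional space form a basis.
        if len(chosen) == n:
            return 1
        spanned = span(chosen, p, n)
        return sum(extend(chosen + [v]) for v in all_vectors if v not in spanned)

    return extend([])


def gl_order(q, n):
    """Closed formula (q^n - 1)(q^n - q)...(q^n - q^(n-1))."""
    order = 1
    for k in range(n):
        order *= q ** n - q ** k
    return order
```

The search makes the shape of the formula visible: once $k$ vectors are chosen, their span has $q^k$ elements, which leaves $q^n - q^k$ choices for the next vector.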




3.3 Space of Linear Transformations, Dual Spaces


Let $V$ and $W$ be vector spaces over a field $F$. Let $\mathrm{Hom}_F(V, W)$ denote the set of
all linear transformations from $V$ to $W$. Let $f, g \in \mathrm{Hom}_F(V, W)$. Define $f + g$ by
$(f + g)(x) = f(x) + g(x)$. It can be easily verified that $f + g \in \mathrm{Hom}_F(V, W)$.
This defines an operation $+$ on $\mathrm{Hom}_F(V, W)$. It is easily seen that $\mathrm{Hom}_F(V, W)$ is
an abelian group with respect to the addition $+$. Let $f \in \mathrm{Hom}_F(V, W)$ and $a \in F$.
We define $af$ by $(af)(x) = a \cdot f(x)$. Then $af$ also belongs to $\mathrm{Hom}_F(V, W)$.
This defines a multiplication by scalars on $\mathrm{Hom}_F(V, W)$. Indeed, $\mathrm{Hom}_F(V, W)$ is
a vector space over $F$ under these operations (verify). In particular, $\mathrm{End}(V)$ is also a
vector space over $F$. In fact, $\mathrm{End}(V)$ is an algebra, the internal multiplication being
the composition of maps.


Theorem 3.3.1 Let $V$ and $W$ be finite-dimensional vector spaces over a field $F$.
Then

$$\dim \mathrm{Hom}_F(V, W) = \dim V \cdot \dim W.$$


Proof Suppose that $\dim V = n$ and $\dim W = m$. Let $\{x_1, x_2, \ldots, x_n\}$ be an ordered
basis of $V$, and $\{y_1, y_2, \ldots, y_m\}$ an ordered basis of $W$. Fix a pair $(i, j)$, $1 \le i \le n$,
$1 \le j \le m$. Since every map from a basis of a vector space to a vector space can be
extended uniquely to a linear transformation, we have a unique $T_{ij} \in \mathrm{Hom}_F(V, W)$
whose restriction to the basis $\{x_1, x_2, \ldots, x_n\}$ is given by $T_{ij}(x_i) = y_j$, and
$T_{ij}(x_k) = 0$ for $k \neq i$. We show that $B = \{T_{ij} \mid 1 \le i \le n, 1 \le j \le m\}$ is a basis of
$\mathrm{Hom}_F(V, W)$. Let $T \in \mathrm{Hom}_F(V, W)$. Then $T(x_i) \in W$. Since $\{y_1, y_2, \ldots, y_m\}$ is a
basis of $W$,

$$T(x_i) = \sum_{j=1}^{m} \alpha_{ji} y_j$$

for unique $\alpha_{ji} \in F$, $1 \le j \le m$, $1 \le i \le n$. It follows that the linear transformations
$T$ and $\sum_{i,j} \alpha_{ji} T_{ij}$ agree on each $x_i$, and so they agree on a basis. This means that
$T = \sum_{i,j} \alpha_{ji} T_{ij}$. Thus, $B$ is a set of generators for $\mathrm{Hom}_F(V, W)$.

Next, we show that $B$ is linearly independent. Suppose that

$$\sum_{i,j} \alpha_{ji} T_{ij} = 0.$$

Then

$$\Big(\sum_{i,j} \alpha_{ji} T_{ij}\Big)(x_k) = 0 \text{ for all } k.$$

Hence,

$$\sum_{i,j} \alpha_{ji} T_{ij}(x_k) = 0 \text{ for all } k.$$

Thus,

$$\sum_{j=1}^{m} \alpha_{jk} y_j = 0 \text{ for all } k.$$

Since $\{y_1, y_2, \ldots, y_m\}$ is linearly independent, $\alpha_{jk} = 0$ for all $j, k$. This shows that
$B$ is linearly independent, and so it is a basis of $\mathrm{Hom}_F(V, W)$. Further, it follows
that

$$\dim \mathrm{Hom}_F(V, W) = n \cdot m = \dim V \cdot \dim W.$$ ♯
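
The basis $\{T_{ij}\}$ used in the proof can be made concrete. In the hedged sketch below, a linear map is stored as a table whose row $k$ lists the coordinates of the image of the $k$-th basis vector of $V$ with respect to the chosen basis of $W$; this storage convention, the 0-based indices, and the names `elementary_map`, `decompose` and `linear_combination` are assumptions made for the illustration.

```python
def elementary_map(i, j, n, m):
    """T_ij as a table: row k lists the y-coordinates of T_ij(x_k).

    Only x_i has a nonzero image, namely y_j (indices are 0-based here).
    """
    return [[1 if (k == i and l == j) else 0 for l in range(m)] for k in range(n)]


def decompose(T):
    """Coefficients alpha[j][i] with T(x_i) = sum_j alpha[j][i] * y_j."""
    n, m = len(T), len(T[0])
    return [[T[i][j] for i in range(n)] for j in range(m)]


def linear_combination(terms, n, m):
    """Sum of c * S over (c, S) pairs, computed coordinate by coordinate."""
    total = [[0] * m for _ in range(n)]
    for c, S in terms:
        for k in range(n):
            for l in range(m):
                total[k][l] += c * S[k][l]
    return total


def rebuild(T):
    """Reassemble T as the combination sum_{i,j} alpha_ji T_ij."""
    n, m = len(T), len(T[0])
    alpha = decompose(T)
    terms = [(alpha[j][i], elementary_map(i, j, n, m))
             for i in range(n) for j in range(m)]
    return linear_combination(terms, n, m)
```

For a map with $n = 2$ and $m = 3$, exactly $n \cdot m = 6$ elementary maps enter the sum, in line with the dimension count of the theorem.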


Definition 3.3.2 Let $V$ be a vector space over a field $F$. Treat $F$ as a vector space
over $F$. The members of $\mathrm{Hom}_F(V, F)$ are called the linear functionals on $V$. The
vector space $\mathrm{Hom}_F(V, F)$, denoted by $V^*$, is called the dual space of $V$.


If $V$ is finite dimensional, then $\dim V^* = \dim \mathrm{Hom}_F(V, F) = \dim V \cdot \dim F =
\dim V$. Thus $V$ and $V^*$ have the same dimension, and so they are isomorphic as vector
spaces. This is not true for infinite-dimensional spaces. For example, the vector space
$\mathbb{R}$ of real numbers over $\mathbb{Q}$ is of infinite dimension, and it has a basis whose cardinality
is the same as the cardinality of $\mathbb{R}$. Thus, the cardinality of $\mathbb{R}^*$ is the same as that of
the set $\mathbb{Q}^{\mathbb{R}}$ of all maps from $\mathbb{R}$ to $\mathbb{Q}$. Clearly, there is no bijective map from $\mathbb{R}$ to $\mathbb{Q}^{\mathbb{R}}$,
and so $\mathbb{R}$ and $\mathbb{R}^*$ are not isomorphic as vector spaces over $\mathbb{Q}$.


Definition 3.3.3 Let $V$ and $W$ be vector spaces over a field $F$. Let $T$ be a linear
transformation from $V$ to $W$. Define a map $T^t$ from $W^*$ to $V^*$ by $T^t(f) = f \circ T$.
Then $T^t$ is a linear transformation (verify), and it is called the transpose of $T$.


Proposition 3.3.4 Let $V_1$, $V_2$ and $V_3$ be vector spaces over a field $F$. Let $T_1 : V_1 \to V_2$
and $T_2 : V_2 \to V_3$ be linear transformations. Then,

$$(T_2 \circ T_1)^t = T_1^t \circ T_2^t.$$

Proof $(T_2 \circ T_1)^t$ is a linear transformation from $V_3^*$ to $V_1^*$ given by

$$(T_2 \circ T_1)^t(f) = f \circ (T_2 \circ T_1) = (f \circ T_2) \circ T_1 = T_2^t(f) \circ T_1 = T_1^t(T_2^t(f)) = (T_1^t \circ T_2^t)(f)$$

for all $f \in V_3^*$. ♯


Let $V$ be a vector space over a field $F$. The dual $(V^*)^*$ of $V^*$ is called the
double dual of $V$, and it is denoted by $V^{**}$. Let $x \in V$. Define a map $x^{**}$ from
$V^*$ to $F$ by

$$x^{**}(f) = f(x).$$

It can be checked easily that $x^{**}$ is a linear functional on $V^*$. Thus $x^{**} \in V^{**}$. This
gives us a map $x \mapsto x^{**}$ from $V$ to $V^{**}$.




Proposition 3.3.5 Let $V$ be a vector space over a field $F$. Then the map $x \mapsto x^{**}$
from $V$ to $V^{**}$ is an injective homomorphism. If $V$ is finite dimensional, then it is
also an isomorphism.


Proof

$$(\alpha x + \beta y)^{**}(f) = f(\alpha x + \beta y) = \alpha f(x) + \beta f(y) = \alpha x^{**}(f) + \beta y^{**}(f)$$

for all $f \in V^*$. Thus, $(\alpha x + \beta y)^{**} = \alpha x^{**} + \beta y^{**}$, and so the map $x \mapsto x^{**}$ is a linear
transformation. To show that this is injective, it is sufficient to show that $x^{**} = 0$
implies that $x = 0$. Suppose that $x \neq 0$. Then $\{x\}$, being linearly independent, can
be enlarged to a basis of $V$. We have a map from this basis to $F$ which is 1 on $x$ and
zero at all other members of the basis. This can be extended to a linear functional $f$
on $V$ which is 1 at $x$. Thus, $x^{**}(f) = f(x) = 1 \neq 0$.

Finally, if $V$ is finite dimensional, then $\dim V = \dim V^* = \dim V^{**}$. Hence
any injective linear transformation from $V$ to $V^{**}$, in particular $x \mapsto x^{**}$, is an
isomorphism. ♯


Remark 3.3.6 A vector space $V$ is said to be reflexive if the map $x \mapsto x^{**}$ is an
isomorphism from $V$ to $V^{**}$. Thus, every finite-dimensional vector space is reflexive.
Clearly, the vector space $\mathbb{R}$ over $\mathbb{Q}$ is not reflexive.


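We know that every finite-dimensional vector space is isomorphic to its dual (being
of the same dimension). The question is whether we have a natural isomorphism. More
precisely, do we have isomorphisms $f_V$ from $V$ to $V^*$ for all vector spaces $V$ such
that, given a linear transformation $T$ from $V$ to $W$, the following diagram commutes?

[Diagram: a square with $f_V : V \to V^*$ along the top, $f_W : W \to W^*$ along the bottom,
$T : V \to W$ on the left and $T^t : W^* \to V^*$ on the right; it commutes when
$f_V = T^t \circ f_W \circ T$.]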


The answer to this question is in the negative. Suppose that we have a family

$$\{f_V : V \to V^* \mid V \text{ is a vector space over } F\}$$

of isomorphisms. Let $V$ be a one-dimensional vector space with $\{x\}$ as a basis,
$x \neq 0$. Define a linear functional $x^* \in V^*$ by $x^*(\alpha x) = \alpha$. Then $x^* \neq 0$. Since
$V^*$ is also one dimensional, $\{x^*\}$ is a basis of $V^*$. Hence there is a $\lambda \in F$ such that
$f_V(x) = \lambda x^*$, and $\lambda \neq 0$ because $f_V$ is injective. Take a $\mu \in F$ such that $\mu^2 \neq 1$.
Define a linear transformation $T$ on $V$ by $T(x) = \mu x$. Then,

$$(T^t \circ f_V \circ T)(x) = T^t(f_V(T(x))) = T^t(\lambda \mu x^*) = \lambda \mu x^* \circ T.$$

Also, $f_V(x) = \lambda x^*$. Now

$$(\lambda \mu x^* \circ T)(x) = \lambda \mu x^*(T(x)) = \lambda \mu x^*(\mu x) = \lambda \mu^2 x^*(x) = \lambda \mu^2$$

and $f_V(x)(x) = \lambda x^*(x) = \lambda$. Since $\mu^2 \neq 1$ and $\lambda \neq 0$, $f_V \neq T^t \circ f_V \circ T$. Thus, the above
diagram is not commutative.


However, the following result says that every finite-dimensional vector space is
naturally isomorphic to its double dual.


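Proposition 3.3.7 The family

$$\{f_V : V \to V^{**} \mid V \text{ is finite dimensional}, \ f_V(x) = x^{**}\}$$

defines natural isomorphisms from finite-dimensional vector spaces to their double
duals in the sense that, given any linear transformation $T : V \to W$, the following
diagram is commutative.

[Diagram: a square with $f_V : V \to V^{**}$ along the top, $f_W : W \to W^{**}$ along the bottom,
$T$ on the left and $(T^t)^t$ on the right; it commutes when $f_W \circ T = (T^t)^t \circ f_V$.]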


Proof We have already seen that $f_V$ defined above is an isomorphism. Let $x \in V$.
Then

$$(f_W \circ T)(x) = f_W(T(x)) = T(x)^{**}.$$

Also,

$$((T^t)^t \circ f_V)(x) = (T^t)^t(x^{**}) = x^{**} \circ T^t.$$

Now,

$$T(x)^{**}(g) = g(T(x))$$

for all $g \in W^*$. Further,

$$(x^{**} \circ T^t)(g) = x^{**}(T^t(g)) = x^{**}(g \circ T) = (g \circ T)(x) = g(T(x))$$

for all $g \in W^*$. This shows that $T(x)^{**} = x^{**} \circ T^t$ for all $x \in V$. Hence
$f_W \circ T = (T^t)^t \circ f_V$. ♯


Definition 3.3.8 Let $V$ be a vector space over a field $F$. Let $\{e_1, e_2, \ldots, e_n\}$ be a
basis of $V$. Consider $F$ as a vector space over $F$ with $\{1\}$ as a basis. Fix $i$, $1 \le i \le n$.
Define a linear functional $e_i^*$ on $V$ by

$$e_i^*(e_j) = \begin{cases} 1 & \text{if } j = i, \\ 0 & \text{otherwise.} \end{cases}$$

Then, as in Theorem 3.3.1, $\{e_1^*, e_2^*, \ldots, e_n^*\}$ is a basis of $V^*$, called the dual basis, which
is dual to $\{e_1, e_2, \ldots, e_n\}$.

Let $V$ and $W$ be vector spaces over a field $F$. Let $\{x_1, x_2, \ldots, x_n\}$ be a basis of $V$,
and $\{y_1, y_2, \ldots, y_m\}$ be a basis of $W$. Let $\{x_1^*, x_2^*, \ldots, x_n^*\}$ and $\{y_1^*, y_2^*, \ldots, y_m^*\}$ be
the corresponding dual bases. Let $T$ be a linear transformation from $V$ to $W$. Suppose
that

$$T(x_i) = \sum_{j=1}^{m} \alpha_{ji} y_j.$$

Now, $T^t(y_k^*) = y_k^* \circ T$, and

$$(y_k^* \circ T)(x_i) = y_k^*(T(x_i)) = y_k^*\Big(\sum_{j=1}^{m} \alpha_{ji} y_j\Big) = \sum_{j=1}^{m} \alpha_{ji} y_k^*(y_j) = \alpha_{ki} = \Big(\sum_{l=1}^{n} \alpha_{kl} x_l^*\Big)(x_i).$$

Thus,

$$T^t(y_k^*) = \sum_{i=1}^{n} \alpha_{ki} x_i^*.$$

This gives us an expression for $T^t$ in terms of dual bases provided we know the
expression for $T$ in terms of the given bases.


3.4 Rank and Nullity 


Definition 3.4.1 Let $V$ and $W$ be vector spaces of finite dimensions over a field $F$.
Let $T$ be a linear transformation from $V$ to $W$. The dimension of the image $T(V)$
is called the rank of $T$, and it is denoted by $r(T)$. The dimension of the null space
$N(T)$ of $T$ is called the nullity of $T$, and it is denoted by $n(T)$.


Thus, $T$ is injective if and only if $n(T) = 0$, and $T$ is surjective if and only if
$r(T) = \dim(W)$.


Theorem 3.4.2 Let V and W be vector spaces of finite dimensions over a field F. 
Let T be a linear transformation from V to W. Then, 


r(T) + n(T) = dim(V). 




Proof From the fundamental theorem of homomorphism, $T(V)$ is isomorphic to
$V/N(T)$. Also $\dim(V/N(T)) = \dim(V) - \dim N(T)$. Hence

$$r(T) = \dim(T(V)) = \dim V/N(T) = \dim V - \dim(N(T)) = \dim V - n(T).$$ ♯
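
The identity $r(T) + n(T) = \dim(V)$ can be checked numerically for a map given by a matrix. The sketch below is an illustration under stated assumptions: it works over the rationals, treats $T$ as multiplication of coordinate columns by a matrix with respect to standard bases (a convention introduced only in Section 3.5), and uses the hypothetical helper names `rref`, `rank` and `null_space_basis`.

```python
from fractions import Fraction


def rref(matrix):
    """Reduced row echelon form over Q; returns (rows, pivot_columns)."""
    a = [[Fraction(x) for x in row] for row in matrix]
    rows, cols = len(a), len(a[0])
    pivots = []
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if a[i][c] != 0), None)
        if pivot is None:
            continue
        a[r], a[pivot] = a[pivot], a[r]
        lead = a[r][c]
        a[r] = [x / lead for x in a[r]]
        # Clear column c in every other row.
        for i in range(rows):
            if i != r and a[i][c] != 0:
                factor = a[i][c]
                a[i] = [x - factor * y for x, y in zip(a[i], a[r])]
        pivots.append(c)
        r += 1
        if r == rows:
            break
    return a, pivots


def rank(matrix):
    """Dimension of the image: the number of pivot columns."""
    return len(rref(matrix)[1])


def null_space_basis(matrix):
    """Basis of {v : A v = 0}, one vector per free (non-pivot) column."""
    a, pivots = rref(matrix)
    cols = len(matrix[0])
    basis = []
    for free in (c for c in range(cols) if c not in pivots):
        v = [Fraction(0)] * cols
        v[free] = Fraction(1)
        for row_index, p in enumerate(pivots):
            v[p] = -a[row_index][free]
        basis.append(v)
    return basis
```

For the matrix `[[1, 1, 1], [1, -1, 1]]`, which represents $(a, b, c) \mapsto (a + b + c, a - b + c)$ in standard coordinates, the rank and the number of null space basis vectors add up to 3.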

Proposition 3.4.3 Let $T_1 : V_1 \to V_2$ and $T_2 : V_2 \to V_3$ be linear transformations
between finite-dimensional spaces over a field $F$. Then

$$r(T_2 \circ T_1) \le \min(r(T_1), r(T_2)).$$

Proof Since $T_2(T_1(V_1))$ is a subspace of $T_2(V_2)$,

$$r(T_2 \circ T_1) = \dim(T_2(T_1(V_1))) \le \dim T_2(V_2) = r(T_2).$$

Next, applying Theorem 3.4.2 to the restriction of $T_2$ to $T_1(V_1)$, it follows that

$$\dim T_2(T_1(V_1)) = \dim T_1(V_1) - n(T_2|_{T_1(V_1)}) \le \dim T_1(V_1) = r(T_1).$$ ♯

Corollary 3.4.4 Under the hypothesis of the above proposition, if $T_1$ is an
isomorphism, then $r(T_2 \circ T_1) = r(T_2)$, and if $T_2$ is an isomorphism, then
$r(T_2 \circ T_1) = r(T_1)$.


Proof Suppose that $T_1$ is an isomorphism. Then from the previous proposition, it
follows that

$$r(T_2) = r(T_2 \circ T_1 \circ T_1^{-1}) \le r(T_2 \circ T_1) \le r(T_2).$$

Thus, $r(T_2) = r(T_2 \circ T_1)$. The rest of the assertion follows similarly. ♯


Corollary 3.4.5 Let $T : V \to W$ be a linear transformation between finite-dimensional
vector spaces over a field $F$. Then

$$r(T) = r((T^t)^t).$$

Proof From Proposition 3.3.7, we have $f_W \circ T = (T^t)^t \circ f_V$. Since $f_V$ and $f_W$ are
isomorphisms, from the previous corollary, it follows that

$$r(T) = r(f_W \circ T) = r((T^t)^t \circ f_V) = r((T^t)^t).$$ ♯


Corollary 3.4.6 Let $T : V \to W$ be a linear transformation between finite-dimensional
vector spaces over a field $F$. Then

$$r(T) = r(T^t).$$

Proof Let $r = r(T)$. Let $\{y_1 = T(x_1), y_2 = T(x_2), \ldots, y_r = T(x_r)\}$ be
a basis of $T(V)$. Enlarge the linearly independent subset $\{y_1, y_2, \ldots, y_r\}$ to a
basis $\{y_1, y_2, \ldots, y_r, y_{r+1}, \ldots, y_m\}$ of $W$. Consider the corresponding dual basis
$\{y_1^*, y_2^*, \ldots, y_r^*, y_{r+1}^*, \ldots, y_m^*\}$ of $W^*$. Then $y_s^*(y_i) = 0$ for all $i \le r$ and $s \ge r + 1$.
This means that $y_s^*(T(V)) = \{0\}$ for all $s \ge r + 1$. Thus, $T^t(y_s^*) = y_s^* \circ T = 0$ for
all $s \ge r + 1$. This shows that $T^t(W^*)$ is generated by $\{T^t(y_1^*), T^t(y_2^*), \ldots, T^t(y_r^*)\}$. It
follows that $r(T^t) \le r(T)$. In turn, $r((T^t)^t) \le r(T^t) \le r(T)$. Already by Corollary
3.4.5, $r(T) = r((T^t)^t)$. Hence $r(T) = r(T^t)$. ♯
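
The rank statements of this section can be tried out on small matrices. The following sketch is an illustration only: it works over the rationals, identifies a linear transformation with its matrix in standard coordinates, and relies on the fact, proved later in Proposition 3.6.5, that the matrix of $T^t$ with respect to dual bases is the transposed matrix. The helper names `rank`, `transpose` and `matmul` are hypothetical.

```python
from fractions import Fraction


def rank(matrix):
    """Rank over Q by forward Gaussian elimination."""
    a = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for c in range(len(a[0])):
        pivot = next((i for i in range(r, len(a)) if a[i][c] != 0), None)
        if pivot is None:
            continue
        a[r], a[pivot] = a[pivot], a[r]
        for i in range(r + 1, len(a)):
            factor = a[i][c] / a[r][c]
            a[i] = [x - factor * y for x, y in zip(a[i], a[r])]
        r += 1
        if r == len(a):
            break
    return r


def transpose(matrix):
    """Rows become columns."""
    return [list(col) for col in zip(*matrix)]


def matmul(a, b):
    """Matrix product a * b, i.e. the matrix of the composite map."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]
```

With `A = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]`, comparing `rank(A)` with `rank(transpose(A))` illustrates Corollary 3.4.6, and comparing `rank(matmul(B, A))` with `min(rank(B), rank(A))` for a row matrix such as `B = [[2, -1, 0]]` shows that the inequality of Proposition 3.4.3 can be strict.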


3.5 Matrix Representations of Linear Transformations 


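Let $V_1$ and $V_2$ be vector spaces of dimensions $m$ and $n$, respectively, over a field $F$.
Let $\{x_1, x_2, \ldots, x_m\}$ be a basis of $V_1$, and $\{y_1, y_2, \ldots, y_n\}$ a basis of $V_2$. We have a
map $M_{x_1, x_2, \ldots, x_m}^{y_1, y_2, \ldots, y_n}$ from $\mathrm{Hom}_F(V_1, V_2)$ to the set $M_{nm}(F)$ of $n \times m$ matrices with
entries in $F$ defined by

$$M_{x_1, x_2, \ldots, x_m}^{y_1, y_2, \ldots, y_n}(T) = [a_{ij}],$$

where

$$T(x_j) = \sum_{i=1}^{n} a_{ij} y_i.$$

This map $M_{x_1, x_2, \ldots, x_m}^{y_1, y_2, \ldots, y_n}$ is called the matrix representation map of linear
transformations with respect to the bases $\{x_1, x_2, \ldots, x_m\}$ and $\{y_1, y_2, \ldots, y_n\}$ of $V_1$ and $V_2$,
respectively.

Suppose that

$$M_{x_1, x_2, \ldots, x_m}^{y_1, y_2, \ldots, y_n}(T) = M_{x_1, x_2, \ldots, x_m}^{y_1, y_2, \ldots, y_n}(T') = [a_{ij}].$$

Then

$$T(x_j) = \sum_{i=1}^{n} a_{ij} y_i = T'(x_j)$$

for each $j$. But then the effects of $T$ and $T'$ are the same on the basis $\{x_1, x_2, \ldots, x_m\}$.
This means that $T = T'$. Hence $M_{x_1, x_2, \ldots, x_m}^{y_1, y_2, \ldots, y_n}$ is an injective map. Further, given any
$[a_{ij}] \in M_{nm}(F)$, there is a unique linear transformation $T$ from $V_1$ to $V_2$ whose effect
on the basis $\{x_1, x_2, \ldots, x_m\}$ is given by

$$T(x_j) = \sum_{i=1}^{n} a_{ij} y_i.$$

Clearly, $M_{x_1, x_2, \ldots, x_m}^{y_1, y_2, \ldots, y_n}(T) = [a_{ij}]$. Thus, $M_{x_1, x_2, \ldots, x_m}^{y_1, y_2, \ldots, y_n}$ is a bijective map. Let $T_1, T_2$ be
members of $\mathrm{Hom}_F(V_1, V_2)$, and $a, b \in F$. Suppose that $M_{x_1, x_2, \ldots, x_m}^{y_1, y_2, \ldots, y_n}(T_1) = [a_{ij}]$ and
$M_{x_1, x_2, \ldots, x_m}^{y_1, y_2, \ldots, y_n}(T_2) = [b_{ij}]$. But then

$$(aT_1 + bT_2)(x_j) = \sum_{i=1}^{n} (a a_{ij} + b b_{ij}) y_i.$$

Hence

$$M_{x_1, x_2, \ldots, x_m}^{y_1, y_2, \ldots, y_n}(aT_1 + bT_2) = a[a_{ij}] + b[b_{ij}] = a M_{x_1, x_2, \ldots, x_m}^{y_1, y_2, \ldots, y_n}(T_1) + b M_{x_1, x_2, \ldots, x_m}^{y_1, y_2, \ldots, y_n}(T_2).$$

This proves the following proposition.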


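Proposition 3.5.1 The matrix representation map $M_{x_1, x_2, \ldots, x_m}^{y_1, y_2, \ldots, y_n}$ from $\mathrm{Hom}_F(V_1, V_2)$
to $M_{nm}(F)$ with respect to bases $\{x_1, x_2, \ldots, x_m\}$ of $V_1$ and $\{y_1, y_2, \ldots, y_n\}$ of $V_2$ is
a vector space isomorphism. ♯

Next, let $V_1$ be a vector space with a basis $\{x_1, x_2, \ldots, x_m\}$, $V_2$ a vector space with
a basis $\{y_1, y_2, \ldots, y_n\}$, and $V_3$ a vector space with a basis $\{z_1, z_2, \ldots, z_p\}$, all over
the same field $F$. Let $T_1 : V_1 \to V_2$ and $T_2 : V_2 \to V_3$ be linear transformations
given by $T_1(x_j) = \sum_{i=1}^{n} a_{ij} y_i$ and $T_2(y_i) = \sum_{k=1}^{p} b_{ki} z_k$. Then

$$(T_2 \circ T_1)(x_j) = T_2\Big(\sum_{i=1}^{n} a_{ij} y_i\Big) = \sum_{i=1}^{n} a_{ij} T_2(y_i) = \sum_{i=1}^{n} a_{ij} \sum_{k=1}^{p} b_{ki} z_k = \sum_{k=1}^{p} \Big(\sum_{i=1}^{n} b_{ki} a_{ij}\Big) z_k = \sum_{k=1}^{p} c_{kj} z_k,$$

where $c_{kj} = \sum_{i=1}^{n} b_{ki} a_{ij}$. Thus,

$$M_{x_1, x_2, \ldots, x_m}^{z_1, z_2, \ldots, z_p}(T_2 \circ T_1) = [c_{kj}] = [b_{ki}][a_{ij}] = M_{y_1, y_2, \ldots, y_n}^{z_1, z_2, \ldots, z_p}(T_2) \cdot M_{x_1, x_2, \ldots, x_m}^{y_1, y_2, \ldots, y_n}(T_1).$$

This shows that the matrix representation map with respect to a fixed choice of bases
preserves product also. In particular, we have the following proposition.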
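
The rule that composition of maps corresponds to the product of matrices can be tried out directly. The sketch below is an illustration under assumptions: it uses standard bases only, the two maps `T1` and `T2` are arbitrary examples chosen here (they do not come from the text), and `matrix_of` and `matmul` are hypothetical helper names.

```python
def matrix_of(T, dim_domain):
    """Matrix of T in standard bases: column j holds the coordinates of T(e_j)."""
    columns = []
    for j in range(dim_domain):
        e_j = tuple(1 if k == j else 0 for k in range(dim_domain))
        columns.append(T(e_j))
    # Turn the list of columns into a list of rows.
    return [list(row) for row in zip(*columns)]


def matmul(a, b):
    """Matrix product a * b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def T1(v):
    """Example map from R^3 to R^2."""
    a, b, c = v
    return (a + b + c, a - b + c)


def T2(v):
    """Example map from R^2 to R^3."""
    s, t = v
    return (s, s + t, 2 * t)


def composite(v):
    """The map T2 o T1 from R^3 to R^3."""
    return T2(T1(v))
```

Here `matrix_of(composite, 3)` can be compared with `matmul(matrix_of(T2, 2), matrix_of(T1, 3))`; note that the factor for $T_2$ stands on the left, as in the displayed formula.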


Proposition 3.5.2 Let $V$ be a vector space of dimension $n$ over a field $F$ with a basis
$\{x_1, x_2, \ldots, x_n\}$. Then $M_{x_1, x_2, \ldots, x_n}^{x_1, x_2, \ldots, x_n}$ is an isomorphism from the algebra $\mathrm{End}_F(V)$ of
endomorphisms of $V$ to the algebra $M_n(F)$ of $n \times n$ matrices with entries in $F$. ♯


Corollary 3.5.3 Let $\{x_1, x_2, \ldots, x_n\}$ be a basis of $V$. Then
$M_{x_1, x_2, \ldots, x_n}^{x_1, x_2, \ldots, x_n}$ induces an isomorphism from $GL(V)$ to $GL(n, F)$.


Proof $GL(V)$ is the group of units of $\mathrm{End}_F(V)$ and $GL(n, F)$ is that of $M_n(F)$.
The result follows from the above proposition if we observe that an isomorphism
between algebras induces an isomorphism between their groups of units. ♯


The following corollary is consequence of the above corollary and the Corollary 
2. AP. 


Corollary 3.5.4 Let $F_q$ denote a finite field with $q$ elements. Then the order of the
group $GL(n, F_q)$ (the number of nonsingular $n \times n$ matrices) is

$$(q^n - 1)(q^n - q) \cdots (q^n - q^{n-1}).$$ ♯




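Corollary 3.5.5 Let $V_1$ and $V_2$ be vector spaces over a field $F$ of the same dimension $n$.
Let $\{x_1, x_2, \ldots, x_n\}$ be a basis of $V_1$ and $\{y_1, y_2, \ldots, y_n\}$ that of $V_2$. Then a linear
transformation $T$ from $V_1$ to $V_2$ is an isomorphism if and only if $M_{x_1, x_2, \ldots, x_n}^{y_1, y_2, \ldots, y_n}(T)$ is
an invertible $n \times n$ matrix. Also, if $T^{-1}$ exists, then

$$M_{y_1, y_2, \ldots, y_n}^{x_1, x_2, \ldots, x_n}(T^{-1}) = \big(M_{x_1, x_2, \ldots, x_n}^{y_1, y_2, \ldots, y_n}(T)\big)^{-1}.$$

Proof Clearly,

$$M_{x_1, x_2, \ldots, x_n}^{x_1, x_2, \ldots, x_n}(I_{V_1}) = I_n = M_{y_1, y_2, \ldots, y_n}^{y_1, y_2, \ldots, y_n}(I_{V_2}).$$

Further, since

$$M_{x_1, x_2, \ldots, x_n}^{x_1, x_2, \ldots, x_n}(T' \circ T) = M_{y_1, y_2, \ldots, y_n}^{x_1, x_2, \ldots, x_n}(T') \cdot M_{x_1, x_2, \ldots, x_n}^{y_1, y_2, \ldots, y_n}(T),$$

and

$$M_{y_1, y_2, \ldots, y_n}^{y_1, y_2, \ldots, y_n}(T \circ T') = M_{x_1, x_2, \ldots, x_n}^{y_1, y_2, \ldots, y_n}(T) \cdot M_{y_1, y_2, \ldots, y_n}^{x_1, x_2, \ldots, x_n}(T')$$

for all linear transformations $T$ from $V_1$ to $V_2$, and all linear transformations $T'$ from
$V_2$ to $V_1$, the result follows. Evidently, if $T^{-1}$ exists, then

$$M_{y_1, y_2, \ldots, y_n}^{x_1, x_2, \ldots, x_n}(T^{-1}) = \big(M_{x_1, x_2, \ldots, x_n}^{y_1, y_2, \ldots, y_n}(T)\big)^{-1}.$$ ♯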


In particular, we have the following corollary. 


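Corollary 3.5.6 Let $V$ be a vector space of dimension $n$. Let $\{x_1, x_2, \ldots, x_n\}$ and
$\{y_1, y_2, \ldots, y_n\}$ be bases of $V$. Then $M_{x_1, x_2, \ldots, x_n}^{y_1, y_2, \ldots, y_n}(I_V)$ is invertible, and its inverse is
$M_{y_1, y_2, \ldots, y_n}^{x_1, x_2, \ldots, x_n}(I_V)$. ♯


The matrix $M_{x_1, x_2, \ldots, x_n}^{y_1, y_2, \ldots, y_n}(I_V)$ is called the matrix of transformation from the basis
$\{x_1, x_2, \ldots, x_n\}$ to the basis $\{y_1, y_2, \ldots, y_n\}$. Thus, $M_{x_1, x_2, \ldots, x_n}^{y_1, y_2, \ldots, y_n}(I_V) = [a_{ij}]$, where
$x_j = \sum_{i=1}^{n} a_{ij} y_i$.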


Example 3.5.7 Define a map $T : \mathbb{R}^3 \to \mathbb{R}^2$ by $T((a, b, c)) = (a + b + c, a - b + c)$.
Then $T$ is a linear transformation (verify). Let $x_1 = (1, 1, 1)$, $x_2 = (1, 2, 1)$, $x_3 =
(1, 2, 0)$, $y_1 = (1, 2)$, and $y_2 = (1, 1)$. Suppose that $\alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_3 =
(0, 0, 0)$. Then $\alpha_1 + \alpha_2 + \alpha_3 = 0$, $\alpha_1 + 2\alpha_2 + 2\alpha_3 = 0$, and $\alpha_1 + \alpha_2 = 0$. Solving, we
get that $\alpha_1 = \alpha_2 = \alpha_3 = 0$. This shows that $\{x_1, x_2, x_3\}$ is linearly independent.
Since the dimension of $\mathbb{R}^3$ is 3, it follows that $\{x_1, x_2, x_3\}$ is a basis of $\mathbb{R}^3$. Similarly,
$\{y_1, y_2\}$ is a basis of $\mathbb{R}^2$. Now, suppose that $M_{x_1, x_2, x_3}^{y_1, y_2}(T) = [a_{ij}]$. Then $(3, 1) =
T((1, 1, 1)) = T(x_1) = a_{11} y_1 + a_{21} y_2 = (a_{11} + a_{21}, 2a_{11} + a_{21})$. This shows
that $a_{11} + a_{21} = 3$ and $2a_{11} + a_{21} = 1$. Solving, we get that $a_{11} = -2$, $a_{21} = 5$.
Similarly, looking at $T(x_2)$ and $T(x_3)$, we find that $a_{12} = -4$, $a_{22} = 8$, $a_{13} = -4$,
and $a_{23} = 7$. Thus,

$$M_{x_1, x_2, x_3}^{y_1, y_2}(T) = \begin{bmatrix} -2 & -4 & -4 \\ 5 & 8 & 7 \end{bmatrix}.$$
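
The entries found in Example 3.5.7 can be recomputed mechanically. The sketch below is only a check of the example: it solves each $2 \times 2$ system for the coordinates of $T(x_j)$ in the basis $\{y_1, y_2\}$ by Cramer's rule; the names `coords_in_basis` and `matrix_of_T` are hypothetical helpers.

```python
from fractions import Fraction


def T(v):
    """The map of Example 3.5.7: (a, b, c) -> (a + b + c, a - b + c)."""
    a, b, c = v
    return (a + b + c, a - b + c)


def coords_in_basis(w, y1, y2):
    """Solve w = s*y1 + t*y2 in R^2 by Cramer's rule; returns (s, t)."""
    det = y1[0] * y2[1] - y2[0] * y1[1]
    s = Fraction(w[0] * y2[1] - y2[0] * w[1], det)
    t = Fraction(y1[0] * w[1] - w[0] * y1[1], det)
    return s, t


def matrix_of_T(xs, y1, y2):
    """Matrix of T: column j holds the (y1, y2)-coordinates of T(x_j)."""
    columns = [coords_in_basis(T(x), y1, y2) for x in xs]
    return [list(row) for row in zip(*columns)]
```

Calling `matrix_of_T([(1, 1, 1), (1, 2, 1), (1, 2, 0)], (1, 2), (1, 1))` returns the two rows of the matrix, which can be compared with the entries $a_{ij}$ derived above.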


3.6 Effect of Change of Bases on Matrix Representation 


Proposition 3.6.1 Matrix representations of a linear transformation with respect to
different pairs of bases are equivalent to each other. Conversely, if $A$ and $B$ are $m \times n$
matrices which are equivalent, then they represent the same linear transformation with
respect to suitable pairs of bases.


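Proof Let $V_1$ and $V_2$ be vector spaces over a field $F$ of dimensions $m$ and $n$,
respectively. Let $T$ from $V_1$ to $V_2$ be a linear transformation. Let $\{x_1, x_2, \ldots, x_m\}$ and
$\{x_1', x_2', \ldots, x_m'\}$ be bases of $V_1$, and $\{y_1, y_2, \ldots, y_n\}$ and $\{y_1', y_2', \ldots, y_n'\}$ be those
of $V_2$. Since $T = I_{V_2} \circ T \circ I_{V_1}$, and matrix representation preserves product, we have

$$M_{x_1, \ldots, x_m}^{y_1, \ldots, y_n}(T) = M_{y_1', \ldots, y_n'}^{y_1, \ldots, y_n}(I_{V_2}) \cdot M_{x_1', \ldots, x_m'}^{y_1', \ldots, y_n'}(T) \cdot M_{x_1, \ldots, x_m}^{x_1', \ldots, x_m'}(I_{V_1}).$$

By Corollary 3.5.6, $M_{y_1', \ldots, y_n'}^{y_1, \ldots, y_n}(I_{V_2})$ and $M_{x_1, \ldots, x_m}^{x_1', \ldots, x_m'}(I_{V_1})$ are nonsingular. This
shows that the two matrix representations are equivalent. Conversely, let $A = [a_{ij}]$
and $B = [b_{ij}]$ be $n \times m$ matrices which are equivalent. Let $P$ be an $n \times n$ nonsingular
matrix, and $Q$ an $m \times m$ nonsingular matrix such that $A = PBQ$. Let $\{x_1, x_2, \ldots, x_m\}$
be a basis of $V_1$ and $\{y_1, y_2, \ldots, y_n\}$ a basis of $V_2$. Define $T$ from $V_1$ to $V_2$ by

$$T(x_j) = \sum_{i=1}^{n} a_{ij} y_i.$$

Let $P = [p_{ki}]$ and $Q = [q_{ij}]$. Take $y_i' = \sum_{k=1}^{n} p_{ki} y_k$, and let $x_1', \ldots, x_m'$ be determined
by $x_j = \sum_{i=1}^{m} q_{ij} x_i'$. Since $P$ and $Q$ are invertible, $\{y_1', y_2', \ldots, y_n'\}$ is a basis of $V_2$, and
$\{x_1', x_2', \ldots, x_m'\}$ is a basis of $V_1$. Also $M_{y_1', \ldots, y_n'}^{y_1, \ldots, y_n}(I_{V_2}) = P$ and
$M_{x_1, \ldots, x_m}^{x_1', \ldots, x_m'}(I_{V_1}) = Q$. But then

$$M_{x_1', \ldots, x_m'}^{y_1', \ldots, y_n'}(T) = P^{-1} A Q^{-1} = B.$$ ♯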


Since every matrix is equivalent to a matrix in normal form, we have the following 
corollary. 


Corollary 3.6.2 Let $T$ be a linear transformation from $V_1$ to $V_2$, where $V_1$ and $V_2$ are
of dimensions $m$ and $n$, respectively. Then there exist a basis $\{x_1, x_2, \ldots, x_m\}$ of $V_1$,
a basis $\{y_1, y_2, \ldots, y_n\}$ of $V_2$, and $r \le \min(m, n)$ such that
$T(x_i) = y_i$ for all $i \le r$, and $T(x_i) = 0$ for all $i > r$. ♯


Recall that two square $n \times n$ matrices $A$ and $B$ are said to be similar if there is a
nonsingular $n \times n$ matrix $P$ such that $PAP^{-1} = B$.




Corollary 3.6.3 Let $V$ be a vector space of dimension $n$ over a field $F$. Let
$\{x_1, x_2, \ldots, x_n\}$ and $\{x_1', x_2', \ldots, x_n'\}$ be bases of $V$. Let $T$ be an endomorphism
of $V$. Then $M_{x_1, \ldots, x_n}^{x_1, \ldots, x_n}(T)$ is similar to $M_{x_1', \ldots, x_n'}^{x_1', \ldots, x_n'}(T)$. Conversely, if $A$ and $B$ are
$n \times n$ similar matrices, then there are bases $\{x_1, x_2, \ldots, x_n\}$ and $\{x_1', x_2', \ldots, x_n'\}$, and
a linear transformation $T$ such that $M_{x_1, \ldots, x_n}^{x_1, \ldots, x_n}(T) = A$ and $M_{x_1', \ldots, x_n'}^{x_1', \ldots, x_n'}(T) = B$.


Proof The result follows from the Corollary 3.5.5 (look at its proof), if we observe
that the inverse of $M_{x_1, \ldots, x_n}^{x_1', \ldots, x_n'}(I_V)$ is $M_{x_1', \ldots, x_n'}^{x_1, \ldots, x_n}(I_V)$. ♯
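
Example 3.6.4 Consider the usual vector space $\mathbb{R}^3$ over $\mathbb{R}$. Consider the bases
$\{x_1, x_2, x_3\}$ and $\{y_1, y_2, y_3\}$ of $\mathbb{R}^3$, where $x_1 = (1, 1, 1)$, $x_2 = (1, 2, 4)$, $x_3 =
(1, 3, 9)$, $y_1 = (1, 2, 4)$, $y_2 = (1, 3, 9)$, and $y_3 = (1, 4, 16)$ (verify that these
are bases). Let $T$ be a linear transformation such that

$$M_{x_1, x_2, x_3}^{x_1, x_2, x_3}(T) = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 2 & 1 \\ 1 & 1 & 3 \end{bmatrix}.$$

Suppose that $M_{x_1, x_2, x_3}^{y_1, y_2, y_3}(I_{\mathbb{R}^3}) = [a_{ij}]$. Then, $x_1 = I_{\mathbb{R}^3}(x_1) = a_{11} y_1 + a_{21} y_2 + a_{31} y_3$.
Hence $1 = a_{11} + a_{21} + a_{31}$, $1 = 2a_{11} + 3a_{21} + 4a_{31}$, $1 = 4a_{11} + 9a_{21} + 16a_{31}$. Solving, we get
$a_{11} = 3$, $a_{21} = -3$, $a_{31} = 1$. Similarly, looking at the representations of $x_2$ and
$x_3$ in terms of $y_1$, $y_2$ and $y_3$, we find that $a_{12} = 1$, $a_{22} = 0 = a_{32} = a_{13} = a_{33}$,
and $a_{23} = 1$. Thus,

$$M_{x_1, x_2, x_3}^{y_1, y_2, y_3}(I_{\mathbb{R}^3}) = \begin{bmatrix} 3 & 1 & 0 \\ -3 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}.$$

Similarly,

$$M_{y_1, y_2, y_3}^{x_1, x_2, x_3}(I_{\mathbb{R}^3}) = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & -3 \\ 0 & 1 & 3 \end{bmatrix}.$$

By Corollary 3.5.6, the above two matrices are inverses of each other. Further,

$$M_{y_1, y_2, y_3}^{y_1, y_2, y_3}(T) = M_{x_1, x_2, x_3}^{y_1, y_2, y_3}(I_{\mathbb{R}^3}) \cdot M_{x_1, x_2, x_3}^{x_1, x_2, x_3}(T) \cdot M_{y_1, y_2, y_3}^{x_1, x_2, x_3}(I_{\mathbb{R}^3}).$$

Substituting the values and multiplying, we obtain that

$$M_{y_1, y_2, y_3}^{y_1, y_2, y_3}(T) = \begin{bmatrix} 5 & 1 & -9 \\ -2 & 3 & 13 \\ 1 & 0 & -2 \end{bmatrix}.$$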



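Proposition 3.6.5 Let $V_1$ and $V_2$ be vector spaces over a field $F$. Let $\{x_1, x_2, \ldots, x_n\}$
be a basis of $V_1$ and $\{y_1, y_2, \ldots, y_m\}$ a basis of $V_2$. Let
$\{x_1^*, x_2^*, \ldots, x_n^*\}$ and $\{y_1^*, y_2^*, \ldots, y_m^*\}$ be the corresponding dual bases of $V_1^*$ and $V_2^*$,
respectively. Let $T$ be a linear transformation from $V_1$ to $V_2$. Then

$$M_{y_1^*, \ldots, y_m^*}^{x_1^*, \ldots, x_n^*}(T^t) = \big(M_{x_1, \ldots, x_n}^{y_1, \ldots, y_m}(T)\big)^t.$$

Proof Let

$$M_{x_1, \ldots, x_n}^{y_1, \ldots, y_m}(T) = [a_{ij}].$$

Then

$$T(x_j) = \sum_{i=1}^{m} a_{ij} y_i.$$

Suppose that

$$T^t(y_k^*) = \sum_{j=1}^{n} b_{jk} x_j^*.$$

By the definition, $T^t(y_k^*) = y_k^* \circ T$. Hence

$$(y_k^* \circ T)(x_l) = \sum_{j=1}^{n} b_{jk} x_j^*(x_l),$$

i.e.,

$$y_k^*(T(x_l)) = b_{lk}.$$

Now

$$y_k^*(T(x_l)) = y_k^*\Big(\sum_{i=1}^{m} a_{il} y_i\Big) = \sum_{i=1}^{m} a_{il} y_k^*(y_i) = a_{kl}.$$

This shows that $b_{lk} = a_{kl}$ for all $k, l$, that is, $[b_{ij}] = [a_{ij}]^t$. ♯
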
Exercises 


3.6.1 Let $F$ be a field. Show that the vector space $F^n$ is isomorphic to the vector
space $F^m$ if and only if $n = m$.


3.6.2 Define a map $T$ from $\mathbb{R}^3$ to $\mathbb{R}^3$ by $T((x, y, z)) = (x - y, y - z, z - x)$.
Show that $T$ is a linear transformation. Find its matrix representation
with respect to the standard bases. Also find its matrix representation with respect
to the basis $\{(1, 1, 1), (1, 2, 3), (1, 4, 9)\}$ of the domain, and the basis $\{(1, 1, 0),
(0, 1, 1), (1, 0, 1)\}$ of the range. Show that $T(\mathbb{R}^3)$ is the subspace of $\mathbb{R}^3$ represented
by the plane $x + y + z = 0$. What is $N(T)$? Find the rank and the nullity of $T$.


3.6.3 Let $\bar{a} = (a_1, a_2, a_3)$ be a fixed vector in $\mathbb{R}^3$. Define a map $T$ from $\mathbb{R}^3$ to $\mathbb{R}$ by

$$T(\bar{r}) = \bar{r} \cdot \bar{a} \quad \text{(scalar product)}.$$

Show that $T$ is a linear transformation. Interpret the kernel of $T$ if $\bar{a} \neq 0$. Find its
matrix representation with respect to the standard bases. What is the rank, and what
is the nullity of $T$?


3.6.4 Consider the subspace $W = \{(l\alpha, m\alpha, n\alpha) \mid \alpha \in \mathbb{R}\}$ of $\mathbb{R}^3$, where
$(l, m, n) \neq 0$. Show that the quotient space $\mathbb{R}^3/W$ is isomorphic to the subspace
represented by the plane $lx + my + nz = 0$.


3.6.5 Let $f$ be a linear functional on $\mathbb{R}^3$. Show that there is a vector $\bar{a}$ in $\mathbb{R}^3$ such
that $f(\bar{r}) = \bar{r} \cdot \bar{a}$ (the scalar product).


3.6.6 Determine a linear transformation from R? to R? whose kernel is /x + my + 
nz = 0. 


3.6.7 Find the number of linear transformations on a vector space $V$ of dimension
$n$ over a field $F_q$ containing $q$ elements.


3.6.8 Let $V$ be a vector space of dimension $n$ over a field $F$. Let $\{x_1, x_2, \ldots, x_n\}$ be a
basis of $V$. Let $T_1$ and $T_2$ be linear transformations on $V$. Show that $T_1 \circ T_2 - T_2 \circ T_1$ is
also a linear transformation on $V$. Show that $\sum_{i=1}^{n} (x_i^* \circ (T_1 \circ T_2 - T_2 \circ T_1))(x_i) = 0$.
Deduce that $T_1 \circ T_2 - T_2 \circ T_1$ can never be the identity map.


3.6.9 Let $T$ be a linear transformation on $V$. Let us call $T$ a nilpotent endomorphism
if $T^m = 0$ for some $m$. Suppose that $T$ is nilpotent. Show that $I_V + T$ is an
automorphism of $V$. Find the inverse of $I_V + T$ if $T^m = 0$.


3.6.10 Let $T \in \mathrm{End}(V) = \mathrm{Hom}_F(V, V)$. Let $f(X) \in F[X]$. Define an element
$f(T) \in \mathrm{End}(V)$ by

$$f(T) = a_0 I + a_1 T + \cdots + a_r T^r,$$

where $f(X) = a_0 + a_1 X + a_2 X^2 + \cdots + a_r X^r$. Suppose that $V$ is finite dimensional.
Show that there is a nonzero polynomial $f(X) \in F[X]$ such that $f(T) = 0$.

Hint. If $\dim V = n$, then $\dim \mathrm{End}(V) = n^2$, and so $\{I_V, T, T^2, \ldots, T^{n^2}\}$ is linearly
dependent.


3.6.11 Let $T \in \mathrm{End}(V)$. Show that $V$ is an $F[X]$-module with respect to the external
operation $\cdot$ given by $f(X) \cdot v = f(T)(v)$.


3.6.12 Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be a linear transformation which preserves the angle between
vectors in the sense that if $P$ and $Q$ are points in $\mathbb{R}^2$, then the angle between $OP$ and
$OQ$, where $O$ is the origin, is the same as the angle between $OT(P)$ and $OT(Q)$. Show
that either $T$ is a reflection about a line passing through the origin, or it is a rotation in
the plane through an angle $\alpha$.

Hint. Suppose that there is a point $P$ different from the origin such that $T(P) = P$.
Then show that $T = I$, or it is a reflection about the line passing through $O$ and $P$.
Next, if $T$ fixes no point other than $O$, then show that $T((1, 0)) = (\cos\alpha, \sin\alpha)$ for
some $\alpha$, and then show that $T((x, y)) = (x\cos\alpha - y\sin\alpha, x\sin\alpha + y\cos\alpha)$.


3.6.13 Show that any angle preserving endomorphism of $\mathbb{R}^3$ is either a rotation about
a fixed axis, or a reflection about a plane passing through the origin.


3.6.14 Let $P_n$ denote the vector space of polynomials over the field $\mathbb{R}$ of real numbers
of degrees at most $n$. Define a map $T$ from $P_n$ to itself by


T (Go -PagX + aX? Baia, 8") = ay Baek 4 ag? ev ena, xX", 


Show that T is a linear transformation. Find its rank and nullity. Is T invertible? 


3.6.15 Let $C^{\infty}(\mathbb{R})$ denote the vector space of real-valued functions on $\mathbb{R}$ which are
$r$ times continuously differentiable for all $r$. Define a linear transformation
$D^2 - 2D + 1$ from $C^{\infty}(\mathbb{R})$ to itself by

$$(D^2 - 2D + 1)(f)(X) = \frac{d^2 f(X)}{dX^2} - 2\frac{d f(X)}{dX} + f(X).$$

Find the nullity of $D^2 - 2D + 1$, and also a basis of the kernel.


3.6.16 Let $V$ be a vector space of dimension $m$ over a field $F_q$ of order $q$, and $W$
a vector space of dimension $n$ over $F_q$. Suppose that $m < n$. Find the number of
injective linear transformations from $V$ to $W$.


3.6.17 Suppose that m > n in the above exercise. Find the number of surjective 
linear transformations from V to W. 


3.6.18 Let $V$ be a vector space of dimension $n$ over a field $F$. Let $\{e_1, e_2, \ldots, e_n\}$
be an ordered basis of $V$. Let $p$ be a permutation in $S_n$. Then we have a map $T_p$
from the ordered basis $\{e_1, e_2, \ldots, e_n\}$ to $V$ given by $T_p(e_i) = e_{p(i)}$, which extends
uniquely to a linear transformation on $V$. Show that
$p \mapsto T_p$ defines an injective homomorphism from $S_n$ to the group $GL(V)$ of all
automorphisms of $V$. Deduce that every group of order $n$ is isomorphic to a subgroup
of $GL(V)$.


3.6.19 Let $V$ be a finite-dimensional vector space over a field $F$. Let $T, S \in \mathrm{End}(V)$
such that $S \circ T = I_V$ ($T \circ S = I_V$). Show that $T \circ S = I_V$ ($S \circ T = I_V$).


3.6.20 Show by means of an example that the above result is not true for
infinite-dimensional spaces.

Hint. Let $V$ be the vector space of all real-valued continuous functions on $[1, \infty)$
over the field $\mathbb{R}$ of real numbers. Consider the map $T$ given by

$$T(f)(x) = \int_1^x f(y)\,dy.$$

Use the fundamental theorem of integral calculus.


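3.6.21 Let $V$ be a vector space of dimension $n$ over a field $F$, and $T$ be an element
of the center of $\mathrm{End}(V)$. Then $T = \alpha I_V$ for some $\alpha \in F$.


Proof Let $x \in V$, $x \neq 0$. We show that there is a $\lambda_x \in F$ such that $T(x) = \lambda_x x$.
Suppose not. Then $\{x, T(x)\}$ is linearly independent, and so it can be embedded
in a basis $\{x, T(x), x_3, \ldots, x_n\}$ of $V$. Define a linear transformation $S$ by $S(x) =
x$, $S(T(x)) = 0$, and $S(x_i) = 0$ for all $i \ge 3$. Then $ST(x) = 0$, whereas
$TS(x) = T(x) \neq 0$. This contradicts the assumption that $T$ is in the center. Thus,
for all $x \in V$, $x \neq 0$, there exists a $\lambda_x \in F$ such that
$T(x) = \lambda_x x$. Now, we show that $\lambda_x = \lambda_y$ for $x \neq 0 \neq y$. Suppose that $\lambda_x \neq \lambda_y$. Then
$\lambda_{x-y}(x - y) = T(x - y) = \lambda_x x - \lambda_y y$. But then $(\lambda_x - \lambda_{x-y})x = (\lambda_y - \lambda_{x-y})y$.
Since $\lambda_x \neq \lambda_y$, $\{x, y\}$ is linearly independent. Hence $\lambda_x = \lambda_{x-y} = \lambda_y$, a contradiction. This
shows that $T = \lambda I_V$ for some $\lambda$. ♯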


3.6.22 Let $V$ be a vector space of dimension $n$ over a field $F$. Let $T$ be a linear
transformation on $V$ which commutes with each element of the group $GL(V)$. Then
$T = \alpha I_V$ for some $\alpha \in F$. In particular, $Z(GL(V)) = \{\alpha I_V \mid \alpha \in F^*\}$.


Proof We again show that for each $x \in V$, there is an element $\lambda_x \in F$ such that
$T(x) = \lambda_x x$. The rest will follow as in the above exercise. Suppose that there is an
$x \in V$, $x \ne 0$, for which there is no $\lambda \in F$ such that $T(x) = \lambda x$. Then $\{x, T(x)\}$
is linearly independent, and so it can be embedded in a basis $\{x, T(x), x_3, \ldots, x_n\}$ of
$V$. Let $S$ be a linear transformation defined by $S(x) = x$, $S(T(x)) = -T(x)$, and
$S(x_i) = x_i$ for all $i \ge 3$. Then, since $S$ takes a basis to a basis, it is an element of $GL(V)$.
Further, $TS(x) = T(x)$, whereas $ST(x) = -T(x)$. Since $T(x) \ne 0$, $TS \ne ST$.


3.6.23 Let $V$ be a finite-dimensional vector space over a field $F$. Let $\Gamma$ be a nontrivial
subspace of $\mathrm{End}_F(V)$ such that $T \circ S$ and $S \circ T$ belong to $\Gamma$ for all $S \in \mathrm{End}_F(V)$ and
$T \in \Gamma$. Then $\Gamma = \mathrm{End}_F(V)$ (in the language of ring theory, this is expressed by
saying that the ring $\mathrm{End}_F(V)$ has no proper two-sided ideals).


Proof Let $T$ be a nonzero linear transformation in $\Gamma$. Let $\{x_1, x_2, \ldots, x_n\}$ be a basis
of $V$. Since $T \ne 0$, $T(x_i) \ne 0$ for some $i$. Without any loss of generality, we
may assume that $T(x_1) \ne 0$. Take any $i$. There is a linear transformation $S$ such that
$S(T(x_1)) = x_i$ (embed $\{T(x_1)\}$ into a basis). There is also a linear transformation
$S'$ such that $S'(x_1) = x_1$ and $S'(x_j) = 0$ for all $j > 1$. Thus, $STS'(x_1) = x_i$,
and $STS'(x_j) = 0$ for all $j \ge 2$. Hence $STS' = T_{i1}$. By the hypothesis,
$T_{i1} \in \Gamma$. Also $T_{ij} = T_{i1}T_{1j} \in \Gamma$ for all $j$. Thus, $\Gamma$ is a subspace of $\mathrm{End}_F(V)$
containing all $T_{ij}$. Since $\{T_{ij} \mid 1 \le i \le n,\ 1 \le j \le n\}$ is a basis of the vector space
$\mathrm{End}_F(V)$, $\Gamma = \mathrm{End}_F(V)$. ♯


3.6.24 Let $V$ be a finite-dimensional vector space over $F$. Show that $GL(V)$
generates $\mathrm{End}_F(V)$ as a vector space.




3.6.25 Let $T \in \mathrm{End}_F(V)$ be such that $T^2 = T$, where $V$ is finite dimensional
(such a transformation is called idempotent). Show that every element $x$ of $V$ can be
uniquely expressed as $x = y + z$, where $T(y) = 0$ and $T(z) = z$.


3.6.26 Let $T \in \mathrm{End}_F(V)$ be nilpotent, and $f(X) \in F[X]$. Show that $f(T)$ is an
automorphism of $V$ if and only if $f(X)$ has a nonzero constant term.


3.6.27 Let $T \in \mathrm{End}_F(V)$, and $\dim(V) = n$. Show that there is a monic polynomial
$f(X)$ such that $f(T) = 0$. The monic polynomial of smallest degree with this property
is called the minimum polynomial of $T$. Show that $T$ is an isomorphism if and only if the
minimum polynomial of $T$ has a nonzero constant term. Deduce that, in this case, $T^{-1} = g(T)$
for some polynomial $g(X)$.


3.6.28 Show that $g(T) = 0$ if and only if the minimum polynomial of $T$ divides
$g(X)$.


3.6.29 Let $V$ be a finite-dimensional vector space over a field $F$. Let $T$ be a nonzero
element of $\mathrm{End}_F(V)$. Show that there is an $S \in \mathrm{End}_F(V)$ such that $ST \ne 0$ and
$(ST)^2 = ST$.


3.6.30 Let $T \in \mathrm{End}_F(V) - GL(V) - \{0\}$. Show that there is an $S \in \mathrm{End}_F(V)$ such
that $ST = 0$ but $TS \ne 0$.


3.6.31 Find the minimum polynomial of the linear transformation in Exercise 3.6.2. 
3.6.32 Show that Z(GL(V)) is isomorphic to the multiplicative group F*. 


3.6.33 Find the order of a Sylow $p$-subgroup of $GL(V)$, where $V$ is a vector space
of dimension $n$ over $\mathbb{Z}_p$. Find also a Sylow $p$-subgroup.


3.6.34 Let $V$ and $W$ be vector spaces of dimensions $n$ and $m$, respectively, over a
finite field $F_q$ of order $q$. Let $r \le \min(n, m)$. Find the number of linear transformations
from $V$ to $W$ of rank $r$.


3.6.35 Let $T_1$ and $T_2$ be endomorphisms of a vector space $V$ of dimension $n$ over a
field $F$. Show that

\[ r(T_1 \circ T_2) \ge r(T_1) + r(T_2) - n. \]


3.6.36 Let $V$ be a vector space of dimension 2 over $\mathbb{Z}_2$. Show that $GL(V)$ is
isomorphic to $S_3$.


3.6.37 Let $T \in \mathrm{End}_F(V)$, where $V$ is a vector space of dimension $n$. Suppose that
$T^r = 0$ for some $r$. Show that $T^n = 0$.


3.6.38 Show that every linear functional $f$ on $V$ defines a linear transformation $T_f$
from $\mathrm{End}_F(V)$ to $V^*$ by $T_f(T) = f \circ T$. Let $V$ be a vector space over $F$, and
$\{x_1, x_2, \ldots, x_n\}$ be a basis of $V$. Consider $p = p_1 + p_2 + \cdots + p_n$, where $p_i$ is
the $i$th projection linear functional on $V$ with respect to the given basis. Show that $p$
is independent of the choice of basis. Show, further, that the linear functional $Tr$ on
$\mathrm{End}_F(V)$ defined by $Tr(T) = p_1(T(x_1)) + p_2(T(x_2)) + \cdots + p_n(T(x_n))$ is also
independent of the choice of basis of $V$. This is called the trace form on $\mathrm{End}_F(V)$.
Show that $TS - ST \in \mathrm{Ker}\,Tr$ for all $S, T \in \mathrm{End}_F(V)$. Does $Tr(T') = 0$ imply
that $T' = TS - ST$ for some $S, T \in \mathrm{End}_F(V)$?


3.6.39 Let $f$ be a linear functional on $\mathrm{End}_F(V)$, where $V$ is as in the above exercise.
Suppose that $f(AB - BA) = 0$ for all $A, B \in \mathrm{End}_F(V)$, and $f(I_V) = n$. Show
that $f = Tr$.


3.6.40 Show that $AB - BA$ can never be an automorphism of $V$, where $V$ is finite
dimensional. Deduce that $I_V + AB - BA$ can never be nilpotent.


3.6.41 Show that the subgroup of the additive group $\mathrm{End}_F(V)$ generated by $GL(V)$
is $\mathrm{End}_F(V)$.


3.6.42 Let $T$ be a linear transformation from $V$ to $V$. Show that the following
conditions are equivalent:

(i) $N(T) \cap \mathrm{image}\,T = \{0\}$.

(ii) $T^2(x) = 0$ implies that $T(x) = 0$.


3.6.43 Let $T$ be a linear transformation on $V$ such that the rank of $T$ is the same as the rank
of $T^2$. Show that $\mathrm{Ker}\,T \cap \mathrm{image}\,T = \{0\}$.


Chapter 4 
Inner Product Spaces 


In the vector space theory, we have talked about points, lines, and planes as translates 
of subspaces of a vector space. In this chapter, we shall talk about the concepts of 
angle between lines (planes), distance between points, shortest distances between 
planes, area, volumes of parallelepiped, etc. We also discuss rigid motions in an inner 
product space. For this purpose, we enrich the structure of vector space by putting 
the concept of inner product. We have to consider vector spaces over particular types 
of fields. All fields F in this chapter will either be the field R of real numbers or the 
field C of complex numbers. We have a field automorphism $a \mapsto \bar{a}$ from C to itself
called the complex conjugation. The restriction of the complex conjugation to R is 
the identity map. 


4.1 Definition, Examples, and Basic Properties 


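Definition 4.1.1 Let $F$ be the field $\mathbb{R}$ of real numbers or the field $\mathbb{C}$ of complex
numbers, and $V$ a vector space over $F$. A map $\langle\ ,\ \rangle$ from $V \times V$ to $F$ (the image
of $(x, y)$ under $\langle\ ,\ \rangle$ is denoted by $\langle x, y \rangle$) is called an inner product (real inner
product if $F = \mathbb{R}$, and complex inner product if $F = \mathbb{C}$) on $V$ if the following hold:


1. $\langle \alpha x + \beta y, z \rangle = \alpha \langle x, z \rangle + \beta \langle y, z \rangle$ for all $\alpha, \beta \in F$, and $x, y, z \in V$.
2. $\langle x, y \rangle = \overline{\langle y, x \rangle}$ for all $x, y \in V$.
   In particular, $\langle x, x \rangle = \overline{\langle x, x \rangle}$ for all $x \in V$, and so $\langle x, x \rangle$ is a real
   number for all $x \in V$.
3. $\langle x, x \rangle \ge 0$, and $\langle x, x \rangle = 0$ if and only if $x = 0$.


A vector space $V$ together with an inner product $\langle\ ,\ \rangle$ on $V$ is called an
inner product space.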


© Springer Nature Singapore Pte Ltd. 2017
R. Lal, Algebra 2, Infosys Science Foundation Series in Mathematical Sciences,
DOI 10.1007/978-981-10-4256-0_4


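Putting $\alpha = 1 = \beta$ in 1, we obtain that

\[ \langle x + y, z \rangle = \langle x, z \rangle + \langle y, z \rangle, \]

and putting $\beta = 0$ in 1, we obtain that

\[ \langle \alpha x, y \rangle = \alpha \langle x, y \rangle. \]

Next, using 2 and 1, we get

\[ \langle x, \alpha y + \beta z \rangle = \overline{\langle \alpha y + \beta z, x \rangle} = \overline{\alpha \langle y, x \rangle + \beta \langle z, x \rangle}
   = \bar{\alpha}\,\overline{\langle y, x \rangle} + \bar{\beta}\,\overline{\langle z, x \rangle} = \bar{\alpha} \langle x, y \rangle + \bar{\beta} \langle x, z \rangle. \]

Thus,

\[ \langle x, \alpha y + \beta z \rangle = \bar{\alpha} \langle x, y \rangle + \bar{\beta} \langle x, z \rangle \]

for all $x, y, z \in V$, and $\alpha, \beta \in F$.
Putting $\alpha = 1 = \beta$ in the above equation, we obtain

\[ \langle x, y + z \rangle = \langle x, y \rangle + \langle x, z \rangle, \]

and putting $\beta = 0$, we get

\[ \langle x, \alpha y \rangle = \bar{\alpha} \langle x, y \rangle. \]

Further,

\[ \langle 0, y \rangle = \langle 0 \cdot 0, y \rangle = 0 \cdot \langle 0, y \rangle = 0, \]

and similarly,

\[ \langle x, 0 \rangle = \overline{\langle 0, x \rangle} = 0 \]

for all $x, y \in V$.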


Example 4.1.2 Let $V = \mathbb{R}^n$ be the Euclidean vector space of dimension $n$ over $\mathbb{R}$.
Define

\[ \langle x, y \rangle = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n, \]

where $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$. This gives an inner product
on $\mathbb{R}^n$ (verify). This inner product is called the usual standard Euclidean inner
product on $\mathbb{R}^n$.


Example 4.1.3 We have another inner product $\langle\ ,\ \rangle$ on $\mathbb{R}^3$ given by

\[ \langle x, y \rangle = x_1 y_1 + x_2 y_2 + 2 x_3 y_3 + x_2 y_3 + x_3 y_2, \]

where $x = (x_1, x_2, x_3)$ and $y = (y_1, y_2, y_3)$ (verify).




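Example 4.1.4 Let $V = \mathbb{C}^n$ be the complex vector space of dimension $n$. Define $\langle\ ,\ \rangle$
on $\mathbb{C}^n$ by

\[ \langle x, y \rangle = x_1 \overline{y_1} + x_2 \overline{y_2} + \cdots + x_n \overline{y_n}. \]

Then it is a complex inner product (verify). This inner product space is called the
standard unitary space.


Example 4.1.5 Let $V = \mathbb{C}^2$. Define $\langle\ ,\ \rangle$ by

\[ \langle x, y \rangle = 4 x_1 \overline{y_1} + x_2 \overline{y_2} + i\,(x_1 \overline{y_2} - x_2 \overline{y_1}). \]

Then it is a complex inner product (verify).


Example 4.1.6 Let $V$ denote the complex vector space of all complex-valued
continuous functions on $[0, 1]$. Define $\langle\ ,\ \rangle$ by

\[ \langle f, g \rangle = \int_0^1 f(x)\,\overline{g(x)}\,dx. \]

Then $\langle\ ,\ \rangle$ is a complex inner product (verify).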


Example 4.1.7 Let $l^2$ denote the set of all real sequences $\{a_n\}$ such that
$\sum_{n=1}^{\infty} |a_n|^2 < \infty$. Then it is a vector space over $\mathbb{R}$ with respect to the usual
addition of sequences and multiplication by scalars. We have an inner product on $l^2$ given by

\[ \langle \{a_n\}, \{b_n\} \rangle = \sum_{n=1}^{\infty} a_n b_n. \]

Let $\langle\ ,\ \rangle$ be a real (complex) inner product on $\mathbb{R}^n$ ($\mathbb{C}^n$). Let $E = \{e_1, e_2, \ldots, e_n\}$
denote the standard basis of the vector space $\mathbb{R}^n$ ($\mathbb{C}^n$). The inner product $\langle\ ,\ \rangle$
determines a matrix $A = [a_{ij}]$, where $a_{ij} = \langle e_i, e_j \rangle$. Since
$a_{ij} = \langle e_i, e_j \rangle = \langle e_j, e_i \rangle$ ($\overline{\langle e_j, e_i \rangle}$) $= a_{ji}$ ($\overline{a_{ji}}$), $A$ turns out to be a symmetric
(Hermitian) matrix. The matrix $A$, in turn, determines the inner product. Indeed, if
$x = [x_1, x_2, \ldots, x_n] = x_1 e_1 + x_2 e_2 + \cdots + x_n e_n$, and
$y = [y_1, y_2, \ldots, y_n] = y_1 e_1 + y_2 e_2 + \cdots + y_n e_n$, then
$\langle x, y \rangle = x A y^t$ ($x A y^*$). Not all symmetric (Hermitian)
matrices determine inner products in the manner described above. For example,
consider the real symmetric matrix A given by 


hae 


Then $[1, -1] A [1, -1]^t = 0$, whereas $[1, -1] \ne [0, 0]$. As such, $A$ will not
determine an inner product (observe that $A$ is invertible also). Indeed, as we shall see, a
matrix $A$ determines an inner product, as described, if and only if there is a
nonsingular matrix $B$ such that $A = BB^t$ ($BB^*$).
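
For real symmetric matrices, the criterion just stated can be tested numerically: a factorization $A = BB^t$ with $B$ nonsingular exists exactly when a Cholesky factorization with positive pivots succeeds. The following is a minimal Python sketch under that assumption. The name `cholesky_lower`, the tolerance, and the second test matrix are our own illustrative choices and are not taken from the text; the first matrix is the matrix $[\langle e_i, e_j \rangle]$ of Example 4.1.3.

```python
import math

def cholesky_lower(a, tol=1e-12):
    """Return a lower triangular B with a = B B^t, or None when some pivot
    fails to be positive (so a does not determine an inner product)."""
    n = len(a)
    b = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(b[i][k] * b[j][k] for k in range(j))
            if i == j:
                if s <= tol:          # pivot not positive: no such B exists
                    return None
                b[i][j] = math.sqrt(s)
            else:
                b[i][j] = s / b[j][j]
    return b

# Matrix of the inner product of Example 4.1.3 in the standard basis.
gram_413 = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 1.0],
            [0.0, 1.0, 2.0]]

# Hypothetical invertible symmetric matrix with [1, -1] A [1, -1]^t = 0.
indefinite = [[1.0, 2.0],
              [2.0, 3.0]]

print(cholesky_lower(gram_413))
print(cholesky_lower(indefinite))
```

The first call returns a factor $B$, while the second returns `None`, which agrees with the observation that an invertible symmetric matrix need not determine an inner product.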




Definition 4.1.8 Let $(V, \langle\ ,\ \rangle)$ be an inner product space. Let $x \in V$. Then $\langle x, x \rangle$
is a nonnegative real number. Its nonnegative square root is called the length of the
vector $x$, and it is denoted by $\|x\|$. Thus, $\|x\| = +\sqrt{\langle x, x \rangle}$.


Clearly, $\|x\| = 0$ if and only if $x = 0$.
Also, since $\langle \alpha x, \alpha x \rangle = \alpha \bar{\alpha} \langle x, x \rangle = |\alpha|^2 \|x\|^2$, we have

\[ \|\alpha x\| = |\alpha|\,\|x\| \]

for all $\alpha \in F$ and $x \in V$.


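Theorem 4.1.9 (Cauchy-Schwarz inequality) Let $(V, \langle\ ,\ \rangle)$ be an inner product
space. Then $|\langle x, y \rangle| \le \|x\|\,\|y\|$ for all $x, y \in V$. The equality holds if and only
if $\{x, y\}$ is linearly dependent.


Proof If $y = 0$, then both sides are 0, and the equality holds. Assume that $y \ne 0$.
Then $\langle x - \alpha y, x - \alpha y \rangle \ge 0$ for all $\alpha \in F$. Thus,

\[ \langle x, x \rangle - \bar{\alpha} \langle x, y \rangle - \alpha \langle y, x \rangle + \alpha \bar{\alpha} \langle y, y \rangle \ge 0. \]

Putting $\alpha = \frac{\langle x, y \rangle}{\langle y, y \rangle}$ in the above inequality, and noting that $\langle x, y \rangle = \overline{\langle y, x \rangle}$,
we obtain that

\[ \langle x, x \rangle - \frac{\overline{\langle x, y \rangle}\,\langle x, y \rangle}{\langle y, y \rangle}
   - \frac{\langle x, y \rangle\,\overline{\langle x, y \rangle}}{\langle y, y \rangle}
   + \frac{\langle x, y \rangle\,\overline{\langle x, y \rangle}}{\langle y, y \rangle} \ge 0, \]

or

\[ \langle x, y \rangle\,\overline{\langle x, y \rangle} \le \langle x, x \rangle \langle y, y \rangle, \]

or

\[ |\langle x, y \rangle|^2 \le \|x\|^2 \|y\|^2. \]

Taking square root, we obtain that

\[ |\langle x, y \rangle| \le \|x\|\,\|y\|. \]

If $\{x, y\}$ is linearly independent, then $x - \alpha y \ne 0$ for all $\alpha \in F$. Hence the
inequality becomes a strict inequality, and so in this case

\[ |\langle x, y \rangle| < \|x\|\,\|y\|. \]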

Conversely, if $\{x, y\}$ is linearly dependent, then $x = \alpha y$ for some $\alpha \in F$. But,
then

\[ |\langle x, y \rangle| = |\langle \alpha y, y \rangle| = |\alpha|\,|\langle y, y \rangle| = |\alpha|\,\|y\|^2 = \|\alpha y\|\,\|y\| = \|x\|\,\|y\|. \]

♯


Applying the Cauchy—Schwarz inequality for Examples 4.1.4, 4.1.6, and 4.1.7 
respectively, we obtain the following corollaries. 


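Corollary 4.1.10 (Cauchy inequality). Let $x_1, x_2, \ldots, x_n, y_1, y_2, \ldots, y_n$ be complex
numbers. Then

\[ \Bigl| \sum_{i=1}^{n} x_i \overline{y_i} \Bigr| \le \sqrt{\sum_{i=1}^{n} |x_i|^2}\ \sqrt{\sum_{i=1}^{n} |y_i|^2}. \]

In particular, the inequality holds for real numbers also. ♯


Corollary 4.1.11 Let $f$ and $g$ be complex-valued continuous functions on $[0, 1]$.
Then

\[ \Bigl| \int_0^1 f(x)\,\overline{g(x)}\,dx \Bigr| \le \sqrt{\int_0^1 |f(x)|^2\,dx}\ \sqrt{\int_0^1 |g(x)|^2\,dx}. \]

♯


Corollary 4.1.12 If $\{x_n\}$ and $\{y_n\}$ are sequences of real numbers such that
$\sum_{n=1}^{\infty} |x_n|^2 < \infty$ and $\sum_{n=1}^{\infty} |y_n|^2 < \infty$, then

\[ \Bigl| \sum_{n=1}^{\infty} x_n y_n \Bigr| \le \sqrt{\sum_{n=1}^{\infty} |x_n|^2}\ \sqrt{\sum_{n=1}^{\infty} |y_n|^2}. \]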


Find the inequalities coming out of the Examples 4.1.3 and 4.1.5. 


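Proposition 4.1.13 (Triangle inequality). Let $(V, \langle\ ,\ \rangle)$ be an inner product space.
Then

\[ \|x + y\| \le \|x\| + \|y\| \]

for all $x, y \in V$. Equality holds if and only if one of $x, y$ is a nonnegative real
multiple of the other; in particular, equality forces $\{x, y\}$ to be linearly dependent.


Proof

\begin{align*}
\|x + y\|^2 &= |\langle x + y, x + y \rangle| \\
            &= |\langle x, x \rangle + \langle x, y \rangle + \langle y, x \rangle + \langle y, y \rangle| \\
            &\le |\langle x, x \rangle| + |\langle x, y \rangle| + |\langle y, x \rangle| + |\langle y, y \rangle| \\
            &\le \|x\|^2 + 2\|x\|\,\|y\| + \|y\|^2 \quad \text{(by Cauchy-Schwarz)} \\
            &= (\|x\| + \|y\|)^2.
\end{align*}

Taking the square root, we get

\[ \|x + y\| \le \|x\| + \|y\|. \]

Further, it is clear from the above that equality holds if and only if

\[ \langle x, y \rangle + \langle y, x \rangle = 2\|x\|\,\|y\|. \]

Since $\langle x, y \rangle + \langle y, x \rangle = 2\,\mathrm{Re}\,\langle x, y \rangle \le 2|\langle x, y \rangle| \le 2\|x\|\,\|y\|$, this is so if and
only if $\langle x, y \rangle$ is a nonnegative real number and $|\langle x, y \rangle| = \|x\|\,\|y\|$. By the second part
of the Cauchy-Schwarz inequality, the latter condition holds if and only if $\{x, y\}$ is linearly
dependent. If $y = 0$, there is nothing more to show. If $y \ne 0$, then $x = \alpha y$ for some
$\alpha \in F$, and $\langle x, y \rangle = \alpha \|y\|^2$ is a nonnegative real number if and only if $\alpha$ is. ♯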


If we apply the above proposition to Examples 4.1.4, 4.1.6, and 4.1.7, we get the 
following corollaries: 


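Corollary 4.1.14 If $x_1, x_2, \ldots, x_n, y_1, y_2, \ldots, y_n$ are complex numbers (or in
particular real numbers), then

\[ \sqrt{\sum_{i=1}^{n} |x_i + y_i|^2} \le \sqrt{\sum_{i=1}^{n} |x_i|^2} + \sqrt{\sum_{i=1}^{n} |y_i|^2}. \]

♯


Corollary 4.1.15 If $f$ and $g$ are two complex-valued continuous functions on $[0, 1]$,
then

\[ \sqrt{\int_0^1 |f(x) + g(x)|^2\,dx} \le \sqrt{\int_0^1 |f(x)|^2\,dx} + \sqrt{\int_0^1 |g(x)|^2\,dx}. \]


Corollary 4.1.16 If $\{a_n\}$ and $\{b_n\}$ are sequences in $l^2$, then

\[ \sqrt{\sum_{n=1}^{\infty} |a_n + b_n|^2} \le \sqrt{\sum_{n=1}^{\infty} |a_n|^2} + \sqrt{\sum_{n=1}^{\infty} |b_n|^2}. \]

♯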


Notion of Distance in an Inner Product Space 


We first introduce the notion of distance on a set by abstracting the fundamental 
properties of distance. 


Definition 4.1.17 Let $X$ be a set. A map $d$ from $X \times X$ to the set $\mathbb{R}^+ \cup \{0\}$ of
nonnegative real numbers (the image of $(x, y)$ under $d$ is denoted by $d(x, y)$ instead of
$d((x, y))$) is called a distance or a metric on $X$ if


1. $d(x, y) = 0$ if and only if $x = y$.
2. $d(x, y) = d(y, x)$.
3. (Triangle inequality) $d(x, y) \le d(x, z) + d(y, z)$
for all $x, y, z \in X$. The pair $(X, d)$ is called a metric space.




Proposition 4.1.18 Let $(V, \langle\ ,\ \rangle)$ be an inner product space. Then the inner product
$\langle\ ,\ \rangle$ induces a metric $d$ on $V$ defined by

\[ d(x, y) = \|x - y\|. \]


Proof $d(x, y) = \|x - y\| \ge 0$, and $d(x, y) = 0$ if and only if $\|x - y\| = 0$.
This means that $d(x, y) = 0$ if and only if $x - y = 0$, or equivalently, $x = y$.
Since

\[ \|x - y\| = \|-1(y - x)\| = |-1|\,\|y - x\| = \|y - x\|, \]

it follows that $d(x, y) = d(y, x)$.
Also

\[ d(x, y) = \|x - y\| = \|x - z + z - y\| \le \|x - z\| + \|z - y\| = d(x, z) + d(z, y). \]

♯


Remark 4.1.19 Let $(V, d)$ be a metric space. It may not be possible to define a vector
space structure on $V$, and an inner product on $V$, so that the induced metric is $d$. For
example, take a nonempty set $V$, and define a metric $d'$ on $V$ by $d'(x, y) = 0$ if
$x = y$ and 1 otherwise (verify that $d'$ is indeed a metric, called a discrete metric).
Given any inner product space structure on $V$, and $x \ne 0$ in $V$, and $d$ the induced
metric, then $d\bigl(\frac{2x}{\|x\|}, 0\bigr) = 2$, and so $d'$ cannot be induced by an inner product.


Notion of Angle, Orthogonality 


Let $(V, \langle\ ,\ \rangle)$ be an inner product space. Then by the Cauchy-Schwarz inequality
$|\langle x, y \rangle| \le \|x\|\,\|y\|$ for all $x, y \in V$. If $x \ne 0 \ne y$, then

\[ \frac{|\langle x, y \rangle|}{\|x\|\,\|y\|} \le 1. \]

If it is a real inner product space, then

\[ -1 \le \frac{\langle x, y \rangle}{\|x\|\,\|y\|} \le 1. \]

Thus, there is a unique $\theta$, $0 \le \theta \le \pi$, such that

\[ \cos\theta = \frac{\langle x, y \rangle}{\|x\|\,\|y\|}. \]

This $\theta$ is called the angle between $x$ and $y$. In case it is a complex inner product
space, the above argument implies that there is a unique $\theta$, $0 \le \theta < 2\pi$, such that

\[ \cos\theta + i \sin\theta = \frac{\langle x, y \rangle}{\|x\|\,\|y\|}. \]

This $\theta$ may be termed as the angle between $x$ and $y$.

Any two vectors $x$ and $y$ in an inner product space are said to be orthogonal if
$\langle x, y \rangle = 0$. This definition extends to the null vector 0 also. Thus, the null vector
0 is orthogonal to each vector. The notation $x \perp y$ is used to say that $x$ and $y$ are
orthogonal. Thus,

$x \perp y$ if and only if $\langle x, y \rangle = 0$.
A vector $x$ is called a unit vector if $\|x\| = 1$.
Proposition 4.1.20 (Pythagoras Theorem). Let $(V, \langle\ ,\ \rangle)$ be a real inner product
space, and $x, y \in V$. Then $x \perp y$ if and only if

\[ \|x - y\|^2 = \|x\|^2 + \|y\|^2, \]

or equivalently,

\[ \|x + y\|^2 = \|x\|^2 + \|y\|^2. \]

Proof

\[ \|x - y\|^2 = \langle x - y, x - y \rangle = \langle x, x \rangle - \langle x, y \rangle - \langle y, x \rangle + \langle y, y \rangle
   = \|x\|^2 + \|y\|^2 - 2\langle x, y \rangle. \]

The result follows. ♯


Remark 4.1.21 In a complex inner product space also, if $x \perp y$, then
$\|x - y\|^2 = \|x\|^2 + \|y\|^2$ (verify). But the converse is not true. Consider, for example,
the unitary space $\mathbb{C}^2$. The vectors $x = (0, 1)$ and $y = (1, i)$ are not orthogonal but still
$\|x - y\|^2 = \|x\|^2 + \|y\|^2 = 3$.
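
The counterexample in the remark can be checked directly. The following Python sketch uses the standard unitary inner product of Example 4.1.4; the helper names `inner` and `norm_sq` are our own and are chosen only for illustration.

```python
def inner(u, v):
    """Standard unitary inner product on C^n: sum of u_i * conj(v_i)."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def norm_sq(u):
    """Squared length <u, u>, which is always a real number."""
    return inner(u, u).real

x = (0j, 1 + 0j)        # x = (0, 1)
y = (1 + 0j, 1j)        # y = (1, i)
diff = tuple(a - b for a, b in zip(x, y))

print(inner(x, y))                          # nonzero, so x and y are not orthogonal
print(norm_sq(diff), norm_sq(x) + norm_sq(y))
```

Here $\langle x, y \rangle = -i \ne 0$, while both printed norms agree, as claimed in the remark.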


In a real inner product space $(V, \langle\ ,\ \rangle)$, we have

\[ \|x - y\|^2 = \|x\|^2 + \|y\|^2 - 2\|x\|\,\|y\| \cos\theta, \]

and

\[ \|x + y\|^2 = \|x\|^2 + \|y\|^2 + 2\|x\|\,\|y\| \cos\theta. \]

These equations give formulas for the diagonals of a parallelogram in terms of its sides.


Proposition 4.1.22 (Parallelogram Law). In an inner product space $(V, \langle\ ,\ \rangle)$ we
have

\[ \|x - y\|^2 + \|x + y\|^2 = 2\|x\|^2 + 2\|y\|^2 \]

for all $x, y \in V$.




Proof Adding equations

\[ \|x - y\|^2 = \|x\|^2 + \|y\|^2 - \langle x, y \rangle - \langle y, x \rangle \]

and

\[ \|x + y\|^2 = \|x\|^2 + \|y\|^2 + \langle x, y \rangle + \langle y, x \rangle \]

we get the result. ♯


The geometrical meaning of the above proposition is that the sum of the areas of 
the squares formed on the diagonals of a parallelogram is the sum of the areas of 
squares formed on the sides of the parallelogram. 

The following identities, termed as the polarization identities, relate the norm with 
the inner product. 


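Proposition 4.1.23 (Polarization identities)


1. If $(V, \langle\ ,\ \rangle)$ is a real inner product space, then

\[ \langle x, y \rangle = \frac{1}{4}\bigl(\|x + y\|^2 - \|x - y\|^2\bigr) \]

for all $x, y \in V$.
2. If $(V, \langle\ ,\ \rangle)$ is a complex inner product space, then

\[ \langle x, y \rangle = \frac{1}{2}\bigl[\|x + y\|^2 + i\,\|x + iy\|^2 - (1 + i)\bigl(\|x\|^2 + \|y\|^2\bigr)\bigr] \]

for all $x, y \in V$.


Proof 1. Let $(V, \langle\ ,\ \rangle)$ be a real inner product space. Then

\[ \|x + y\|^2 = \langle x + y, x + y \rangle = \|x\|^2 + \|y\|^2 + 2\langle x, y \rangle \tag{4.1} \]

and

\[ \|x - y\|^2 = \langle x - y, x - y \rangle = \|x\|^2 + \|y\|^2 - 2\langle x, y \rangle \tag{4.2} \]

for all $x, y \in V$. Subtracting the second equation from the first equation, we get
the desired identity.
2. Let $(V, \langle\ ,\ \rangle)$ be a complex inner product space. Then

\[ \|x + y\|^2 = \langle x + y, x + y \rangle = \|x\|^2 + \|y\|^2 + \langle x, y \rangle + \langle y, x \rangle \tag{4.3} \]

and

\[ \|x + iy\|^2 = \langle x + iy, x + iy \rangle = \|x\|^2 + \|y\|^2 - i\langle x, y \rangle + i\langle y, x \rangle \tag{4.4} \]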




for all $x, y \in V$. Adding $i$ times Eq. 4.4 to Eq. 4.3, we get the desired
result. ♯
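
As a sanity check on the complex polarization identity, the following Python sketch compares both sides for one pair of vectors in the standard unitary space. The helper names and the test vectors are our own choices, and the formula coded in `polarize` is the one obtained by adding $i$ times Eq. 4.4 to Eq. 4.3.

```python
def inner(u, v):
    """Standard unitary inner product on C^n: sum of u_i * conj(v_i)."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def norm_sq(u):
    """Squared length <u, u> as a real number."""
    return inner(u, u).real

def polarize(x, y):
    """Recover <x, y> from norms only:
    (1/2) [ |x+y|^2 + i |x+iy|^2 - (1+i)(|x|^2 + |y|^2) ]."""
    s = [a + b for a, b in zip(x, y)]
    t = [a + 1j * b for a, b in zip(x, y)]
    return 0.5 * (norm_sq(s) + 1j * norm_sq(t)
                  - (1 + 1j) * (norm_sq(x) + norm_sq(y)))

x = [1 + 2j, 3 - 1j]     # arbitrary test vectors (assumed, for illustration)
y = [2 - 1j, 1j]

print(inner(x, y), polarize(x, y))   # the two values should coincide
```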


Let $(V, \langle\ ,\ \rangle)$ be an inner product space. A subset $S$ of $V$ is called
an orthonormal set if


(i) $\|x\| = 1$ for all $x \in S$ and
(ii) $\langle x, y \rangle = 0$ for all $x, y \in S$, $x \ne y$.


Proposition 4.1.24 An orthonormal set in an inner product space is always linearly 
independent. 


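Proof Let $S$ be an orthonormal set and

\[ \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_n x_n = 0, \]

where $x_1, x_2, \ldots, x_n$ are distinct elements of $S$. Then

\[ \alpha_m = \langle \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_n x_n, x_m \rangle = 0. \]

Hence $\alpha_m = 0$ for all $m$. ♯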


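Proposition 4.1.25 (Bessel's inequality). Let $(V, \langle\ ,\ \rangle)$ be an inner product space,
and $\{x_1, x_2, \ldots, x_r\}$ an orthonormal set, $x_i \ne x_j$ for $i \ne j$. Let $x \in V$. Then

\[ \sum_{i=1}^{r} |\langle x, x_i \rangle|^2 \le \|x\|^2. \]

Proof We have

\[ \Bigl\langle x - \sum_{i=1}^{r} \langle x, x_i \rangle x_i,\ x - \sum_{i=1}^{r} \langle x, x_i \rangle x_i \Bigr\rangle \ge 0. \]

Since $\{x_1, x_2, \ldots, x_r\}$ is an orthonormal set, expanding the left-hand side gives

\[ \langle x, x \rangle - \sum_{i=1}^{r} \langle x, x_i \rangle\,\overline{\langle x, x_i \rangle} \ge 0, \]

or

\[ \|x\|^2 - \sum_{i=1}^{r} |\langle x, x_i \rangle|^2 \ge 0. \]

Hence

\[ \sum_{i=1}^{r} |\langle x, x_i \rangle|^2 \le \|x\|^2. \]

♯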

Definition 4.1.26 An orthonormal set which is also a basis is called an orthonormal 
basis. 




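Corollary 4.1.27 Let $(V, \langle\ ,\ \rangle)$ be an inner product space. An orthonormal set
$\{x_1, x_2, \ldots, x_n\}$ is an orthonormal basis if and only if

\[ \|x\|^2 = \sum_{i=1}^{n} |\langle x, x_i \rangle|^2 \quad \text{for all } x \in V. \]

Proof Suppose that $\{x_1, x_2, \ldots, x_n\}$ is an orthonormal basis. Let $x \in V$. Then

\[ x = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_n x_n \]

for some $\alpha_1, \alpha_2, \ldots, \alpha_n$ in $F$. But, then $\langle x, x_i \rangle = \alpha_i$ for all $i$. Hence
$x = \sum_{i=1}^{n} \langle x, x_i \rangle x_i$, and so

\[ \Bigl\langle x - \sum_{i=1}^{n} \langle x, x_i \rangle x_i,\ x - \sum_{i=1}^{n} \langle x, x_i \rangle x_i \Bigr\rangle = 0. \]

Expanding, we get

\[ \|x\|^2 = \sum_{i=1}^{n} |\langle x, x_i \rangle|^2. \]

Conversely, suppose that $\|x\|^2 = \sum_{i=1}^{n} |\langle x, x_i \rangle|^2$ for all $x \in V$. Then

\[ \Bigl\| x - \sum_{i=1}^{n} \langle x, x_i \rangle x_i \Bigr\|^2 = 0, \]

and so

\[ x = \sum_{i=1}^{n} \langle x, x_i \rangle x_i \]

for all $x \in V$. This shows that $\{x_1, x_2, \ldots, x_n\}$ is a set of generators. Since an
orthonormal set is already linearly independent, it is a basis. ♯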


4.2 Gram-Schmidt Process 


The proof of the following theorem gives an algorithm by which we can find an 
orthonormal basis of an inner product space starting from a set of generators for V. 
The process is called the Gram-Schmidt process. 


Theorem 4.2.1 (Gram-Schmidt). Let $(V, \langle\ ,\ \rangle)$ be an inner product space. Let
$\{x_1, x_2, \ldots, x_r\}$ be a finite subset of $V$. Then there exists an orthonormal set
$\{y_1, y_2, \ldots, y_s\}$, $s \le r$, such that the subspace generated by $\{x_1, x_2, \ldots, x_r\}$ is the
same as that generated by $\{y_1, y_2, \ldots, y_s\}$.


Proof The proof is by induction on $r$. If $r = 0$, then the subset is the empty set, and
since the empty set is (vacuously) orthonormal, there is nothing to do. Suppose that
$r = 1$. If $x_1 = 0$, then $\langle \{x_1\} \rangle = \{0\}$, and the empty set is again an orthonormal set
which generates $\{0\}$. If $x_1 \ne 0$, take $y_1 = \frac{x_1}{\|x_1\|}$. Then $\{y_1\}$ is an orthonormal set which
generates $\langle \{x_1\} \rangle$. Assume the result for $r$. Consider a subset $\{x_1, x_2, \ldots, x_r, x_{r+1}\}$ of
$V$. By our induction assumption there is an orthonormal subset $\{y_1, y_2, \ldots, y_s\}$, $s \le r$,
of $V$ such that $\langle \{y_1, y_2, \ldots, y_s\} \rangle = \langle \{x_1, x_2, \ldots, x_r\} \rangle$. If $x_{r+1}$ belongs to this
subspace, then there is nothing to do. Suppose that
$x_{r+1} \notin \langle \{x_1, x_2, \ldots, x_r\} \rangle = \langle \{y_1, y_2, \ldots, y_s\} \rangle$. Then

\[ x_{r+1} - \sum_{i=1}^{s} \langle x_{r+1}, y_i \rangle y_i \ne 0. \]

Take

\[ y_{s+1} = \frac{x_{r+1} - \sum_{i=1}^{s} \langle x_{r+1}, y_i \rangle y_i}{\bigl\| x_{r+1} - \sum_{i=1}^{s} \langle x_{r+1}, y_i \rangle y_i \bigr\|}. \]

Clearly, $y_{s+1}$ is a unit vector which is orthogonal to $y_i$ for each $i \le s$. Evidently,
$\{y_1, y_2, \ldots, y_{s+1}\}$ is an orthonormal set. Since $y_{s+1}$ is a linear combination
of $\{x_{r+1}, y_1, y_2, \ldots, y_s\}$, $\langle \{y_1, y_2, \ldots, y_{s+1}\} \rangle$ is contained in
$\langle \{y_1, y_2, \ldots, y_s, x_{r+1}\} \rangle = \langle \{x_1, x_2, \ldots, x_{r+1}\} \rangle$. Also $x_{r+1}$ is a linear combination of
$\{y_1, y_2, \ldots, y_{s+1}\}$, and so $\langle \{x_1, x_2, \ldots, x_{r+1}\} \rangle$ is contained in
$\langle \{y_1, y_2, \ldots, y_{s+1}\} \rangle$. Thus, the result holds for $r + 1$ also. ♯


Corollary 4.2.2. Every finite-dimensional inner product space admits an orthonormal
basis.


Proof Let $(V, \langle\ ,\ \rangle)$ be a finite-dimensional inner product space. Let $\{x_1, x_2, \ldots, x_n\}$
be a basis of $V$. Then from the above theorem, there exists an orthonormal set
$\{y_1, y_2, \ldots, y_m\}$, $m \le n$, which generates $V$. Since an orthonormal set is linearly
independent, it is a basis, and in turn $n = m$. ♯


Proposition 4.2.3. Every orthonormal set of a finite-dimensional inner product 
space can be enlarged to an orthonormal basis. 


Proof Let $\{x_1, x_2, \ldots, x_m\}$ be an orthonormal set of an inner product space $(V, \langle\ ,\ \rangle)$
of dimension $n$. Since an orthonormal set is linearly independent, $m \le n$. If $m = n$,
it is already a basis, and so an orthonormal basis. Suppose that $m < n$. Then
$\langle \{x_1, x_2, \ldots, x_m\} \rangle \ne V$. Let $y_{m+1}$ be a member of $V - \langle \{x_1, x_2, \ldots, x_m\} \rangle$.
Then $y_{m+1} - \sum_{i=1}^{m} \langle y_{m+1}, x_i \rangle x_i \ne 0$. Take

\[ x_{m+1} = \frac{y_{m+1} - \sum_{i=1}^{m} \langle y_{m+1}, x_i \rangle x_i}{\bigl\| y_{m+1} - \sum_{i=1}^{m} \langle y_{m+1}, x_i \rangle x_i \bigr\|}. \]

Then $\{x_1, x_2, \ldots, x_{m+1}\}$ is an orthonormal set. If $m + 1 = n$, then this is an
orthonormal basis. If not, proceed as above. At the $(n - m)$th step, we shall arrive at an
orthonormal basis containing $\{x_1, x_2, \ldots, x_m\}$. ♯


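Example 4.2.4 This example is to illustrate the Gram-Schmidt process. Consider
the usual Euclidean inner product space $\mathbb{R}^3$. Consider the subset $\{(1, 1, 1), (0, 1, 1),
(2, 1, 1)\}$ of $\mathbb{R}^3$. We determine an orthonormal set which generates the same space
as $\langle \{(1, 1, 1), (0, 1, 1), (2, 1, 1)\} \rangle$. Take

\[ x_1 = \frac{(1, 1, 1)}{\|(1, 1, 1)\|} = \Bigl(\frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}\Bigr), \quad \text{and} \quad x_2 = \frac{y_2}{\|y_2\|}, \]

where

\[ y_2 = (0, 1, 1) - \Bigl\langle (0, 1, 1), \Bigl(\frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}\Bigr) \Bigr\rangle \Bigl(\frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}\Bigr) = \Bigl(-\frac{2}{3}, \frac{1}{3}, \frac{1}{3}\Bigr). \]

Thus, $x_2 = \sqrt{\frac{3}{2}}\,\bigl(-\frac{2}{3}, \frac{1}{3}, \frac{1}{3}\bigr)$. Since

\[ (2, 1, 1) - \langle (2, 1, 1), x_1 \rangle x_1 - \langle (2, 1, 1), x_2 \rangle x_2 = 0, \]

it follows that the orthonormal set $\{x_1, x_2\}$ generates the same space as $\{(1, 1, 1),
(0, 1, 1), (2, 1, 1)\}$. Note that $\{(1, 1, 1), (0, 1, 1), (2, 1, 1)\}$ does not generate $\mathbb{R}^3$.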
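
The inductive step in the proof of Theorem 4.2.1 translates directly into a procedure. The following is a minimal Python sketch for the standard Euclidean inner product on $\mathbb{R}^n$, not a definitive implementation: the function name `gram_schmidt` and the tolerance `tol`, used to decide in floating point whether a vector already lies in the span of the earlier ones, are our own assumptions.

```python
import math

def dot(u, v):
    """Standard Euclidean inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal list spanning the same subspace as `vectors`.
    A vector whose residual is (numerically) zero is skipped, as in the proof."""
    basis = []
    for x in vectors:
        residual = [float(t) for t in x]
        for y in basis:
            c = dot(x, y)                      # coefficient <x, y_i>
            residual = [r - c * yi for r, yi in zip(residual, y)]
        length = math.sqrt(dot(residual, residual))
        if length > tol:                       # x is not in the span so far
            basis.append([r / length for r in residual])
    return basis

# The generators of Example 4.2.4.
basis = gram_schmidt([(1, 1, 1), (0, 1, 1), (2, 1, 1)])
for vector in basis:
    print(vector)
```

For the generators of Example 4.2.4 the procedure returns two vectors, matching $x_1$ and $x_2$ above, since the third generator lies in the span of the first two.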


Let

\[ A = \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_n \end{bmatrix} \]

be an $n \times m$ matrix with rows $\{r_1, r_2, \ldots, r_n\}$. Then $AA^t = [u_{ij}]$, where
$u_{ij} = r_i r_j^t = \langle r_i, r_j \rangle$ ($AA^* = [u_{ij}]$, where $u_{ij} = r_i r_j^* = \langle r_i, r_j \rangle$). Thus, to say
that the rows of $A$ form an orthonormal set is to say that $AA^t = I_n$ ($AA^* = I_n$).
Dually, to say that the columns of $A$ form an orthonormal set is to say that $A^t A = I_m$
($A^* A = I_m$). In particular, we have the following definition.


Definition 4.2.5 A square $n \times n$ matrix $A$ with entries in the field $\mathbb{R}$ ($\mathbb{C}$) of real
(complex) numbers is called an orthogonal (unitary) matrix if the rows of $A$ form
an orthonormal basis of $\mathbb{R}^n$ ($\mathbb{C}^n$), or equivalently, $AA^t = I_n = A^t A$ ($AA^* = I_n =
A^* A$). Alternatively, $A$ is orthogonal if and only if $A^t = A^{-1}$.


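Example 4.2.6 The $2 \times 2$ matrices

\[ \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}
   \quad \text{and} \quad
   \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix} \]

are orthogonal $2 \times 2$ matrices. Indeed, any $2 \times 2$ orthogonal matrix is one of the
above two types (prove it). Observe that the linear transformation

\[ [x, y] \mapsto [x, y] \cdot \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}
   = [x\cos\theta - y\sin\theta,\ x\sin\theta + y\cos\theta] \]

determined by the matrix

\[ \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \]

represents rotation of the $x, y$ plane through an angle $\theta$. Interpret the linear
transformation determined by the second matrix. Indeed, it will represent the reflection about
a line in the plane (find it).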


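Example 4.2.7 The $3 \times 3$ matrix

\[ \begin{bmatrix}
\frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} \\
\frac{1}{\sqrt{6}} & \frac{1}{\sqrt{6}} & -\frac{2}{\sqrt{6}} \\
\frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0
\end{bmatrix} \]

is an orthogonal matrix.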


If $AA^t = I_n = BB^t$, then $AB(AB)^t = ABB^tA^t = I_n$. Thus, the product of two
$n \times n$ orthogonal matrices is orthogonal. Also, since $A^t = A^{-1}$, the inverse of
an orthogonal matrix is orthogonal. The identity matrix is obviously an orthogonal
matrix. This shows that the set $O(n)$ of orthogonal $n \times n$ matrices forms a group
under matrix multiplication. This group $O(n)$ is called the orthogonal group of
$n \times n$ matrices. A subgroup of $O(n)$ is called an orthogonal group. Similarly, the
set $U(n)$ of all $n \times n$ unitary matrices forms a group, called the unitary group.

The proof of the following proposition is algorithmic, and it is essentially the
Gram-Schmidt process.


Proposition 4.2.8 Let $A$ be an $n \times m$ matrix with entries in the field $\mathbb{R}$ ($\mathbb{C}$) of real
(complex) numbers which is of rank $n$. Then, we can find a lower triangular $n \times n$
matrix $P$ with positive diagonal entries such that $PA(PA)^t = I_n$ ($PA(PA)^* = I_n$).
Also, if $A$ is of rank $m$, then we can find an upper triangular $m \times m$ matrix $Q$ with
positive diagonal entries such that $(AQ)^t AQ = I_m$ ($(AQ)^* AQ = I_m$).


Proof Since the rank of $A$ is $n$, the rows $\{r_1, r_2, \ldots, r_n\}$ of $A$ form a linearly
independent subset of $\mathbb{R}^m$. Using the Gram-Schmidt process, we transform the rows of
$A$ to get an $n \times m$ matrix $B$ with the orthonormal rows $\{s_1, s_2, \ldots, s_n\}$. Clearly, then
$BB^t = I_n$. Further, while transforming the rows of $A$ to orthonormal rows, we use
only the following types of elementary row operations in succession:


(i) Multiply a row by a positive number, for example, $s_1 = \frac{r_1}{\|r_1\|}$.
(ii) Add certain linear combinations of the rows preceding the $j$th row to the $j$th row, and
then multiply it by a suitable positive number, for example,
\[ s_2 = \frac{r_2 - \langle r_2, s_1 \rangle s_1}{\| r_2 - \langle r_2, s_1 \rangle s_1 \|}. \]

We further observe that the corresponding elementary matrices are lower triangular
with positive diagonal entries. Thus, $B = PA$ for some lower triangular $n \times n$ matrix
$P$ with positive diagonal entries. Finally, let $A$ be an $n \times m$ matrix of rank $m$. Then $A^t$ is
an $m \times n$ matrix of rank $m$. Applying the above result to $A^t$, we get a lower triangular
$m \times m$ matrix $P$ with positive diagonal entries such that $PA^t(PA^t)^t = I_m$. Take
$Q = P^t$. Then $Q$ is an upper triangular $m \times m$ matrix with positive diagonal entries
such that $(AQ)^t AQ = I_m$. ♯




Corollary 4.2.9 Let $A$ be an $n \times m$ matrix with entries in the field $\mathbb{R}$ ($\mathbb{C}$) of real
(complex) numbers which is of rank $n$. Then, we can find a lower triangular $n \times n$
matrix $L$ with positive diagonal entries and an $n \times m$ matrix $Q$ with $QQ^t = I_n$
($QQ^* = I_n$) such that $A = LQ$. Also, if $A$ is of rank $m$, we can find an upper triangular
$m \times m$ matrix $U$ with positive diagonal entries and an $n \times m$ matrix $Q$ with $Q^tQ = I_m$
($Q^*Q = I_m$) such that $A = QU$.


Proof Follows from the above proposition if we take $L = P^{-1}$, where $PA = Q$. ♯


Corollary 4.2.10 Let $A$ be an $n \times n$ invertible matrix. Then we can find a lower
triangular $n \times n$ matrix $P$ with positive diagonal entries such that $PA$ is an orthogonal
matrix. ♯


Corollary 4.2.11 Every invertible matrix can be decomposed as a product of a lower
triangular matrix with positive diagonal entries and an orthogonal matrix. It can
also be decomposed as a product of an orthogonal matrix with an upper triangular
matrix with positive diagonal entries. ♯


Now, we illustrate the algorithms described above by means of examples. 


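Example 4.2.12 The set $S = \{r_1 = (1, 1, 0),\ r_2 = (0, 1, 1),\ r_3 = (1, 0, 1)\}$ is a
basis of $\mathbb{R}^3$, and therefore, the corresponding matrix

\[ A = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{bmatrix} \]

is invertible. We transform $S$ into an orthonormal basis using the Gram-Schmidt process,
and using the corresponding elementary row operations on $A$, we transform it to an
orthogonal matrix $O$. Further, by applying the same elementary operations on
the identity matrix $I_3$, we obtain the lower triangular matrix $P$ with positive diagonal
entries such that $PA = O$. First, $r_1$ is replaced by
$s_1 = \frac{r_1}{\|r_1\|} = \bigl(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}, 0\bigr)$, and
correspondingly, we multiply the first row of $A$ by $\frac{1}{\sqrt{2}}$ to transform it to

\[ \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{bmatrix}. \]

Further, we replace $r_2$ by
$s_2 = \frac{r_2 - \langle r_2, s_1 \rangle s_1}{\| r_2 - \langle r_2, s_1 \rangle s_1 \|} = \bigl(-\frac{1}{\sqrt{6}}, \frac{1}{\sqrt{6}}, \frac{\sqrt{2}}{\sqrt{3}}\bigr)$, and we apply the
corresponding elementary operations on the matrix to transform it to

\[ \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\ -\frac{1}{\sqrt{6}} & \frac{1}{\sqrt{6}} & \frac{\sqrt{2}}{\sqrt{3}} \\ 1 & 0 & 1 \end{bmatrix}. \]

Finally, we replace $r_3$ by
$s_3 = \frac{r_3 - \langle r_3, s_1 \rangle s_1 - \langle r_3, s_2 \rangle s_2}{\| r_3 - \langle r_3, s_1 \rangle s_1 - \langle r_3, s_2 \rangle s_2 \|} = \bigl(\frac{1}{\sqrt{3}}, -\frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}\bigr)$, and we
apply the corresponding elementary operations on the matrix to transform it to the
orthogonal matrix

\[ O = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\ -\frac{1}{\sqrt{6}} & \frac{1}{\sqrt{6}} & \frac{\sqrt{2}}{\sqrt{3}} \\ \frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} \end{bmatrix}. \]

To get the corresponding lower triangular matrix $P$ so that $PA = O$, we apply
the same elementary row operations in succession on the identity matrix $I_3$ to get

\[ P = \begin{bmatrix} \frac{1}{\sqrt{2}} & 0 & 0 \\ -\frac{1}{\sqrt{6}} & \frac{\sqrt{2}}{\sqrt{3}} & 0 \\ -\frac{1}{2\sqrt{3}} & -\frac{1}{2\sqrt{3}} & \frac{\sqrt{3}}{2} \end{bmatrix}. \]

Lastly, to find a lower triangular matrix $L = P^{-1}$ so that $A = LO$, we apply
the inverses of the same elementary row operations on the identity matrix in reverse
order to get

\[ L = \begin{bmatrix} \sqrt{2} & 0 & 0 \\ \frac{1}{\sqrt{2}} & \frac{\sqrt{3}}{\sqrt{2}} & 0 \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} & \frac{2}{\sqrt{3}} \end{bmatrix}. \]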

4.3 Orthogonal Projection, Shortest Distance 


Let $(X, d)$ be a metric space, and $A$ a subset of $X$. Then
$d(x, A) = \mathrm{g.l.b.}\{d(x, a) \mid a \in A\}$ (the greatest lower bound of the set $\{d(x, a) \mid a \in A\}$)
is called the shortest distance (or simply distance) between the point $x$ and the set
$A$. More precisely, $d(x, A)$ is characterized by the following two properties:


(i) $d(x, A) \le d(x, a)$ for all $a \in A$.
(ii) For all real $\alpha > d(x, A)$, there is an element $a \in A$ such that $\alpha > d(x, a)$.


Proposition 4.3.1 Let $(V, \langle\ ,\ \rangle)$ be a finite-dimensional inner product space. Let $W$
be a subspace of $V$ and $\{x_1, x_2, \ldots, x_r\}$ an orthonormal basis of $W$. Let $x \in V$. Then
the shortest distance between $x$ and $W$ is

\[ \sqrt{\|x\|^2 - \sum_{i=1}^{r} |\langle x, x_i \rangle|^2}. \]


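Proof Since $\sum_{i=1}^{r} \langle x, x_i \rangle x_i \in W$, and

\[ \Bigl\| x - \sum_{i=1}^{r} \langle x, x_i \rangle x_i \Bigr\|^2 = \|x\|^2 - \sum_{i=1}^{r} |\langle x, x_i \rangle|^2, \]

it follows that the shortest distance between $x$ and $W$ is at most

\[ \sqrt{\|x\|^2 - \sum_{i=1}^{r} |\langle x, x_i \rangle|^2}. \]

Further, given any $y = \sum_{i=1}^{r} \alpha_i x_i$ in $W$,

\[ \|x - y\|^2 = \|x\|^2 + \sum_{i=1}^{r} |\alpha_i|^2 - \sum_{i=1}^{r} \overline{\alpha_i} \langle x, x_i \rangle - \sum_{i=1}^{r} \alpha_i \overline{\langle x, x_i \rangle}. \]

Since $a + \bar{a} \le 2|a|$ for all complex numbers $a$, using the Cauchy inequality, we get

\[ \sum_{i=1}^{r} \overline{\alpha_i} \langle x, x_i \rangle + \sum_{i=1}^{r} \alpha_i \overline{\langle x, x_i \rangle}
   \le 2 \Bigl| \sum_{i=1}^{r} \overline{\alpha_i} \langle x, x_i \rangle \Bigr|
   \le 2 \sqrt{\sum_{i=1}^{r} |\alpha_i|^2}\ \sqrt{\sum_{i=1}^{r} |\langle x, x_i \rangle|^2}. \]

Writing $s = \sqrt{\sum_{i=1}^{r} |\alpha_i|^2}$ and $t = \sqrt{\sum_{i=1}^{r} |\langle x, x_i \rangle|^2}$, this gives
$\|x - y\|^2 \ge \|x\|^2 + s^2 - 2st = \|x\|^2 - t^2 + (s - t)^2$. Hence

\[ \|x - y\|^2 \ge \|x\|^2 - \sum_{i=1}^{r} |\langle x, x_i \rangle|^2. \]

Thus, $\sqrt{\|x\|^2 - \sum_{i=1}^{r} |\langle x, x_i \rangle|^2}$ is the shortest distance between
$x$ and $W$. ♯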


Proposition 4.3.2 Under the hypothesis of Proposition 4.3.1, $\sum_{i=1}^{r} \langle x, x_i\rangle x_i$
is the unique point of $W$ which is at the shortest distance from $x$.


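Proof It follows from the proof of Proposition 4.3.1 that the said point is at the
shortest distance from $x$. Suppose that $\sum_{i=1}^{r} \alpha_i x_i$ is a point of $W$ which is also at the
shortest distance from $x$. Then
\[ \Big\| x - \sum_{i=1}^{r} \alpha_i x_i \Big\|^2 = \|x\|^2 - \sum_{i=1}^{r} |\langle x, x_i\rangle|^2, \]
and so
\[ \sum_{i=1}^{r} |\alpha_i|^2 + \sum_{i=1}^{r} |\langle x, x_i\rangle|^2 = \sum_{i=1}^{r} \overline{\alpha_i}\langle x, x_i\rangle + \sum_{i=1}^{r} \alpha_i \langle x_i, x\rangle. \qquad (4.1) \]
Consider the usual standard complex inner product on $\mathbb{C}^r$, and the points $u = (\alpha_1, \alpha_2, \ldots, \alpha_r)$ and $v = (\langle x, x_1\rangle, \langle x, x_2\rangle, \ldots, \langle x, x_r\rangle)$ of $\mathbb{C}^r$. Then the above equation means that
\[ \|u\|^2 + \|v\|^2 = \langle u, v\rangle + \langle v, u\rangle. \]
Hence
\[ \langle u, u\rangle + \langle v, v\rangle = \langle u, v\rangle + \langle v, u\rangle. \]
This shows that $\langle u - v, u - v\rangle = 0$, and so $u = v$. Thus, $\alpha_i = \langle x, x_i\rangle$ for
all $i$. ♯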
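The two propositions above translate directly into a computation. The following is a minimal sketch, assuming vectors are given as tuples of numbers and that the caller supplies an orthonormal basis of $W$; the helper names `inner` and `foot_and_distance` are illustrative and not from the text.

```python
import math

def inner(u, v):
    """Standard inner product on C^n: linear in u, conjugate-linear in v."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def foot_and_distance(x, onb):
    """Return (foot of perpendicular, distance) of x relative to W = span(onb).

    `onb` is assumed to be an orthonormal list of vectors (not checked here).
    """
    coeffs = [inner(x, e) for e in onb]                      # <x, x_i>
    foot = [sum(c * e[k] for c, e in zip(coeffs, onb)) for k in range(len(x))]
    # Proposition 4.3.1: d(x, W)^2 = ||x||^2 - sum_i |<x, x_i>|^2
    dist_sq = inner(x, x).real - sum(abs(c) ** 2 for c in coeffs)
    return foot, math.sqrt(max(dist_sq, 0.0))                # clamp rounding noise

# W = the coordinate plane spanned by e1, e2 in R^3
onb = [(1, 0, 0), (0, 1, 0)]
foot, dist = foot_and_distance((3, 4, 12), onb)
```

Here the foot of the perpendicular is the point $(3, 4, 0)$ and the distance is $12$, in agreement with the formula of Proposition 4.3.1.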


Let $V$ be a vector space over a field $F$. The translates of one dimensional subspaces are called
lines or affine lines in $V$. Thus, a line in $V$ is of the form $\{a + \lambda b \mid \lambda \in F\}$ with $b \ne 0$. This line
is the line passing through $a$ and parallel to $b$ (or to the subspace $\{\lambda b \mid \lambda \in F\}$). Translates
of subspaces of dimension greater than 1 are called planes or affine planes.
The subset $a + W$, where $W$ is a subspace of dimension $r > 1$, is called a plane of
dimension $r$ passing through $a$ and parallel to $W$. If $\dim W = \dim V - 1$, then it
is said to be a hyperplane.


Corollary 4.3.3 Let $(V, \langle\ ,\ \rangle)$ be a finite dimensional inner product space. Let $W$
be a subspace of $V$ with $\{x_1, x_2, \ldots, x_r\}$ an orthonormal basis. Then the distance of
a point $a \in V$ from the affine plane $b + W$ (or affine line if $r = 1$) is the same as that
of $a - b$ from $W$, and it is
\[ \sqrt{\|a - b\|^2 - \sum_{i=1}^{r} |\langle a - b, x_i\rangle|^2}. \]
The line of shortest distance from $a$ to $b + W$ is the same as the perpendicular from $a$
to $b + W$, and it is
\[ \Big\{ a + \lambda\Big((a - b) - \sum_{i=1}^{r} \langle a - b, x_i\rangle x_i\Big) \;\Big|\; \lambda \in F \Big\}. \]
The foot of the perpendicular from $a$ to $b + W$ is $b + \sum_{i=1}^{r} \langle a - b, x_i\rangle x_i$.


Proof The shortest distance of $a$ from $b + W$ is $\operatorname{g.l.b.}\{\|a - (b + w)\| \mid w \in W\}$,
which is the same as $\operatorname{g.l.b.}\{\|a - b - w\| \mid w \in W\}$. Thus, the shortest distance
from $a$ to $b + W$ is the same as the shortest distance between $a - b$ and $W$. From
Proposition 4.3.1, it is
\[ \sqrt{\|a - b\|^2 - \sum_{i=1}^{r} |\langle a - b, x_i\rangle|^2}. \]
Since $a - b - \sum_{i=1}^{r} \langle a - b, x_i\rangle x_i$ is orthogonal to each $x_i$, it is also orthogonal
to each member of $W$. Thus, the line joining $a$ and $b + \sum_{i=1}^{r} \langle a - b, x_i\rangle x_i$ is
orthogonal to $W$. Hence, the line passing through $a$ and perpendicular to $b + W$ is
given by
\[ \Big\{ a + \lambda\Big(a - b - \sum_{i=1}^{r} \langle a - b, x_i\rangle x_i\Big) \;\Big|\; \lambda \in F \Big\}. \]
Clearly, $b + \sum_{i=1}^{r} \langle a - b, x_i\rangle x_i$ is the foot of the perpendicular from $a$ onto
$b + W$. ♯


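Proposition 4.3.4 Let $(V, \langle\ ,\ \rangle)$ be an inner product space. Let $W$ be a subspace of
$V$. Then $W^{\perp} = \{v \in V \mid \langle x, v\rangle = 0 \text{ for all } x \in W\}$ is a subspace of $V$.


Proof Clearly $0 \in W^{\perp}$, and so $W^{\perp} \ne \emptyset$. Let $y, z \in W^{\perp}$. Then $\langle x, y\rangle = 0 = \langle x, z\rangle$ for all $x \in W$. But, then $\langle x, \alpha y + \beta z\rangle = \overline{\alpha}\langle x, y\rangle + \overline{\beta}\langle x, z\rangle = 0$ for all scalars $\alpha, \beta$. This shows that $\alpha y + \beta z \in W^{\perp}$. It follows that $W^{\perp}$ is a subspace. ♯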


Definition 4.3.5 The subspace $W^{\perp}$ defined in the above proposition is called the
orthogonal complement of $W$.


Proposition 4.3.6 Let $V$ be a finite dimensional inner product space, and $W$ a
subspace of $V$. Then $V$ is the direct sum of $W$ and $W^{\perp}$.


Proof Let $\{x_1, x_2, \ldots, x_r\}$ be an orthonormal basis of $W$. This, being an orthonormal
set, can be enlarged to an orthonormal basis $\{x_1, x_2, \ldots, x_r, x_{r+1}, \ldots, x_n\}$ of
$V$, where $n = \dim V$. An element $x = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_r x_r + \alpha_{r+1} x_{r+1} + \cdots + \alpha_n x_n$ is orthogonal to $W$ if and only if it is orthogonal to each $x_i$, $i \le r$. This
is so if and only if $\alpha_i = \langle x, x_i\rangle = 0$ for all $i \le r$. This shows that $W^{\perp}$ is the subspace generated by
$\{x_{r+1}, x_{r+2}, \ldots, x_n\}$. Since every element $x \in V$ can be uniquely expressed as
$x = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_r x_r + \alpha_{r+1} x_{r+1} + \alpha_{r+2} x_{r+2} + \cdots + \alpha_n x_n$, it follows
that every element $x$ of $V$ has a unique representation as $x = y + z$, where $y \in W$
and $z \in W^{\perp}$. ♯


Remark 4.3.7 Let $(V, \langle\ ,\ \rangle)$ be a finite-dimensional inner product space, and $W$
a subspace of $V$. Suppose that $x = y + z$, where $y \in W$ and $z \in W^{\perp}$. Then $y$ is
called the component of $x$ along $W$, and $z$ is called the component of $x$ orthogonal
to $W$. Clearly, $y$ is the foot of the perpendicular from $x$ to $W$.


Definition 4.3.8 Each subspace $W$ of an inner product space $V$ defines the map $P_W$
from $V$ to $V$ given by $P_W(x) = y$, where $y$ is the foot of the perpendicular from $x$ to
$W$. The map $P_W$ is a linear transformation called the orthogonal projection of $V$
onto $W$.


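Example 4.3.9 Consider the subspace $W = \{x = (x_1, x_2, x_3) \mid x_1 + x_2 + x_3 = 0\}$. The subset $S = \{(1, -1, 0), (0, 1, -1)\}$ is a basis of $W$ (verify). Using the Gram-Schmidt
process, we get an orthonormal basis
$\big\{\big(\tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}, 0\big), \big(\tfrac{1}{\sqrt{6}}, \tfrac{1}{\sqrt{6}}, -\tfrac{2}{\sqrt{6}}\big)\big\}$ of
$W$. Therefore, the foot of the perpendicular from a point $x = (x_1, x_2, x_3)$ onto $W$ is
\[ \frac{x_1 - x_2}{\sqrt{2}}\Big(\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}, 0\Big) + \frac{x_1 + x_2 - 2x_3}{\sqrt{6}}\Big(\frac{1}{\sqrt{6}}, \frac{1}{\sqrt{6}}, -\frac{2}{\sqrt{6}}\Big) = \Big(\frac{2x_1 - x_2 - x_3}{3}, \frac{-x_1 + 2x_2 - x_3}{3}, \frac{-x_1 - x_2 + 2x_3}{3}\Big). \]
The matrix of the orthogonal projection $P_W$ is given by
\[ \frac{1}{3}\begin{bmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 2 \end{bmatrix}. \]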
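The computation in this example can be checked numerically. The sketch below assumes real vectors stored as tuples and uses classical Gram-Schmidt; the function names `gram_schmidt` and `projection_matrix` are illustrative choices, not notation from the text.

```python
import math

def dot(u, v):
    """Standard real inner product."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Orthonormalize linearly independent real vectors (classical Gram-Schmidt)."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            c = dot(v, e)                       # component of v along e
            w = [wi - c * ei for wi, ei in zip(w, e)]
        norm = math.sqrt(dot(w, w))
        basis.append([wi / norm for wi in w])
    return basis

def projection_matrix(onb, n):
    """Matrix of P_W in the standard basis: entry (i, j) = sum_k e_k[i] * e_k[j]."""
    return [[sum(e[i] * e[j] for e in onb) for j in range(n)] for i in range(n)]

onb = gram_schmidt([(1, -1, 0), (0, 1, -1)])    # the basis S of the example
P = projection_matrix(onb, 3)
```

Up to rounding, `P` has diagonal entries $2/3$ and off-diagonal entries $-1/3$, which is the matrix displayed in the example.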


Remark 4.3.10 The result of Proposition 4.3.6 is not true for infinite-dimensional
inner product spaces. For example, consider the vector space $C[0, 1]$ of real-valued
continuous functions on the closed interval $[0, 1]$ with inner product given by
\[ \langle f, g\rangle = \int_0^1 f(x) g(x)\,dx. \]
Let $W$ be the subspace of polynomial functions. Then $W$ is a proper subspace, and
$W^{\perp} = \{0\}$ (prove it). Thus, $C[0, 1]$ is not the direct sum of $W$ and $W^{\perp}$.


Let $(V, \langle\ ,\ \rangle)$ be an inner product space. Let $y \in V$. It follows from the definition
of the inner product that the map $f_y$ from $V$ to $F$ defined by $f_y(x) = \langle x, y\rangle$ is a
linear functional on $V$, and the map $f^y$ defined by $f^y(x) = \langle y, x\rangle$ is an anti-linear
functional in the sense that $f^y(\alpha x + \beta z) = \overline{\alpha} f^y(x) + \overline{\beta} f^y(z)$. Observe that the
set of all anti-linear functionals also forms a vector space anti-isomorphic to the dual
space $V^*$ of $V$.


Proposition 4.3.11 Let $(V, \langle\ ,\ \rangle)$ be a finite-dimensional inner product space. Then
the map $y \mapsto f_y$ is an anti-linear isomorphism from $V$ to $V^*$. Also the map $y \mapsto f^y$
is an isomorphism from $V$ to the space of all anti-linear functionals on $V$.


Proof That the map $y \mapsto f_y$ is an anti-linear transformation follows from the definition
of the inner product. We show that it is injective. Suppose that $f_y = f_z$. Then
$f_y(x) = f_z(x)$ for all $x$. Hence $\langle x, y\rangle = \langle x, z\rangle$ for all $x$. This means that
$\langle x, y - z\rangle = 0$ for all $x$. In particular, $\langle y - z, y - z\rangle = 0$. This implies
that $y = z$. Next, it is easy to observe that an anti-linear transformation takes a
subspace to a subspace, and an injective anti-linear transformation takes a linearly
independent subset to a linearly independent subset. Thus, the image of $V$ under
the injective anti-linear transformation $y \mapsto f_y$ is a subspace of $V^*$ of dimension
equal to the dimension of $V$. Since $\dim V = \dim V^*$, it follows that $y \mapsto f_y$ is also
surjective. The rest of the proposition follows similarly. ♯


The following corollary is a restatement of the bijectivity of the map $y \mapsto f_y$.


Corollary 4.3.12 Let $(V, \langle\ ,\ \rangle)$ be a finite-dimensional inner product space, and $f$
a linear functional on $V$. Then there is a unique $y \in V$ such that $f(x) = \langle x, y\rangle$
for all $x \in V$. ♯


Remark 4.3.13 The results of Proposition 4.3.11 and Corollary 4.3.12 are
not true, in general, for infinite dimensional inner product spaces. Consider, for
example, the space $P[0, 1]$ of polynomial functions on $[0, 1]$. We have an inner
product on this space defined by
\[ \langle f, g\rangle = \int_0^1 f(x) g(x)\,dx. \]
Consider the linear functional $\phi$ on $P[0, 1]$ defined by $\phi(f) = f(1)$. Check that
there is no $g$ in $P[0, 1]$ such that
\[ \int_0^1 f(x) g(x)\,dx = f(1) \]
for all $f \in P[0, 1]$.




Adjoint of a Linear Transformation 


Let $(V, \langle\ ,\ \rangle)$ be a finite dimensional inner product space. Let $T$ from $V$ to $V$ be a
linear transformation. The map $x \mapsto \langle T(x), y\rangle$ is a linear functional on $V$ (verify)
for each $y \in V$. Hence from Corollary 4.3.12, there is a unique element in $V$,
which we denote by $T^*(y)$, such that
\[ \langle T(x), y\rangle = \langle x, T^*(y)\rangle \]
for each $x \in V$. This defines a map $T^*$ from $V$ to $V$.

Using the defining properties of an inner product, we see that
\[ \langle x, T^*(\alpha y + \beta z)\rangle = \langle T(x), \alpha y + \beta z\rangle = \overline{\alpha}\langle T(x), y\rangle + \overline{\beta}\langle T(x), z\rangle = \overline{\alpha}\langle x, T^*(y)\rangle + \overline{\beta}\langle x, T^*(z)\rangle = \langle x, \alpha T^*(y) + \beta T^*(z)\rangle \]
for each $x \in V$. Thus,
\[ \langle x, T^*(\alpha y + \beta z) - (\alpha T^*(y) + \beta T^*(z))\rangle = 0 \]
for all $x \in V$. Putting $x = T^*(\alpha y + \beta z) - \alpha T^*(y) - \beta T^*(z)$, we get that
\[ T^*(\alpha y + \beta z) = \alpha T^*(y) + \beta T^*(z) \]
for all $y, z \in V$, and $\alpha, \beta \in F$. This shows that $T^*$ is a linear transformation.


Definition 4.3.14 The linear transformation $T^*$ defined above is called the adjoint of
$T$.


Proposition 4.3.15 Let $(V, \langle\ ,\ \rangle)$ be a finite dimensional inner product space, and
$T$ a linear transformation from $V$ to $V$. Let
$B = \{x_1, x_2, \ldots, x_n\}$ be an orthonormal basis of $V$. Let $M(T)$ denote the matrix of
$T$ with respect to the orthonormal basis $B$. Then $M(T^*) = (M(T))^*$ (the tranjugate
of the matrix $M(T)$). Further, if $M(T_2) = (M(T_1))^*$, then $T_2 = T_1^*$.


Proof Suppose that $M(T) = [a_{ij}]$ and $M(T^*) = [b_{ij}]$. Then $T(x_i) = \sum_{k} a_{ki} x_k$
and $T^*(x_j) = \sum_{l} b_{lj} x_l$. Thus, $a_{ji} = \langle \sum_{k} a_{ki} x_k, x_j\rangle = \langle T(x_i), x_j\rangle = \langle x_i, T^*(x_j)\rangle = \langle x_i, \sum_{l} b_{lj} x_l\rangle = \overline{b_{ij}}$. This shows that $M(T^*) = (M(T))^*$.
Suppose that $M(T_2) = (M(T_1))^*$. Then $M(T_2) = M(T_1^*)$. Since the matrix representation
map with respect to a fixed basis is bijective, the result follows. ♯
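As a quick numerical illustration of the proposition, the following sketch takes a complex matrix with respect to the standard orthonormal basis of $\mathbb{C}^2$, forms its tranjugate, and compares $\langle T(x), y\rangle$ with $\langle x, T^*(y)\rangle$. The matrix entries and test vectors are arbitrary sample values, and the helper names are illustrative.

```python
def inner(u, v):
    """Standard inner product on C^n: linear in u, conjugate-linear in v."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def apply(M, x):
    """Matrix-vector product M x."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def tranjugate(M):
    """Conjugate transpose: entry (j, i) of the result is conj(M[i][j])."""
    rows, cols = len(M), len(M[0])
    return [[complex(M[i][j]).conjugate() for i in range(rows)] for j in range(cols)]

M = [[1 + 2j, 3], [1j, 4 - 1j]]      # matrix of T in the standard basis
M_star = tranjugate(M)               # matrix of T* by Proposition 4.3.15
x, y = [1 - 1j, 2], [3j, 1 + 1j]

lhs = inner(apply(M, x), y)          # <T(x), y>
rhs = inner(x, apply(M_star, y))     # <x, T*(y)>
```

The two values `lhs` and `rhs` agree up to rounding, as the defining equation of the adjoint requires.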


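Proposition 4.3.16 Let $(V, \langle\ ,\ \rangle)$ be a finite-dimensional inner product space. The
map $\eta$ defined by $\eta(T) = T^*$ from $\operatorname{End} V$ to $\operatorname{End} V$ is an anti-isomorphism of
algebras which is an involution in the sense that $\eta^2$ is the identity map on $\operatorname{End} V$.


Proof Since $\langle x, (\alpha T_1 + \beta T_2)^*(y)\rangle = \langle (\alpha T_1 + \beta T_2)(x), y\rangle = \alpha\langle T_1(x), y\rangle + \beta\langle T_2(x), y\rangle = \alpha\langle x, T_1^*(y)\rangle + \beta\langle x, T_2^*(y)\rangle = \langle x, (\overline{\alpha} T_1^* + \overline{\beta} T_2^*)(y)\rangle$ for all $x \in V$, it follows that $(\alpha T_1 + \beta T_2)^*(y) = \overline{\alpha} T_1^*(y) + \overline{\beta} T_2^*(y)$ for
each $y \in V$. Hence $(\alpha T_1 + \beta T_2)^* = \overline{\alpha} T_1^* + \overline{\beta} T_2^*$. Further, $\langle x, (T_1 \circ T_2)^*(y)\rangle = \langle T_1 \circ T_2(x), y\rangle = \langle T_2(x), T_1^*(y)\rangle = \langle x, T_2^* T_1^*(y)\rangle$ for all $x, y \in V$. Hence
$(T_1 \circ T_2)^* = T_2^* \circ T_1^*$. Also $\langle x, T(y)\rangle = \overline{\langle T(y), x\rangle} = \overline{\langle y, T^*(x)\rangle} = \langle T^*(x), y\rangle = \langle x, (T^*)^*(y)\rangle$ for all $x, y \in V$. Hence $T = (T^*)^*$. It is clear that
$\eta^2$ is the identity map, and so $\eta$ is bijective. ♯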


Definition 4.3.17 Let $(V, \langle\ ,\ \rangle)$ be a complex inner product space. A linear
transformation $T$ from $V$ to $V$ is called a

(i) self adjoint or Hermitian linear transformation if $T^* = T$, i.e., $\langle T(x), y\rangle = \langle x, T(y)\rangle$ for all $x, y \in V$.
(ii) skew Hermitian linear transformation if $T^* = -T$, i.e., $\langle T(x), y\rangle = -\langle x, T(y)\rangle$ for all $x, y \in V$.
(iii) unitary linear transformation if $T^* = T^{-1}$, or equivalently $\langle T(x), T(y)\rangle = \langle x, y\rangle$ for all $x, y \in V$.
(iv) normal linear transformation if $T^*T = TT^*$, or equivalently, $\langle T(x), T(y)\rangle = \langle T^*(x), T^*(y)\rangle$ for all $x, y \in V$.


It is clear from the definition that every Hermitian linear transformation as well
as every skew Hermitian linear transformation is normal. Also every unitary linear
transformation is normal.

The following corollary is immediate from Proposition 4.3.15. 


Corollary 4.3.18 A linear transformation $T$ on an inner product space $V$ is Hermitian
(skew-Hermitian, unitary or normal) if and only if the matrix representation
$M(T)$ of $T$ relative to an orthonormal basis is Hermitian (skew-Hermitian, unitary,
or normal respectively). ♯


Example 4.3.19 Consider the usual complex inner product space $\mathbb{C}^2$. Define a map $T$
from $\mathbb{C}^2$ to itself by $T((x, y)) = (x + iy, -ix + y)$, $x, y \in \mathbb{C}$. Then $T$ is Hermitian.
Indeed, the matrix of $T$ relative to the standard orthonormal basis $\{e_1, e_2\}$ is
\[ \begin{bmatrix} 1 & i \\ -i & 1 \end{bmatrix} \]
which is clearly a Hermitian matrix. T is not unitary as the matrix is not unitary. The 
linear transformation U = A is unitary (check for the corresponding matrix). The 
linear transformation $(x, y) \mapsto (ix, y)$ from $\mathbb{C}^2$ to itself is unitary (check), but it is
not Hermitian (verify). The linear transformation $(x, y) \mapsto (ix, 2y)$ is normal, but it
is neither Hermitian nor unitary (check).
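The checks requested in this example can be carried out mechanically on the matrices with respect to the standard basis, using Corollary 4.3.18. The sketch below is one way to do so; the predicate names and the tolerance are illustrative assumptions, and the matrices `T`, `D1`, `D2` are those of the maps $(x, y) \mapsto (x + iy, -ix + y)$, $(x, y) \mapsto (ix, y)$ and $(x, y) \mapsto (ix, 2y)$.

```python
def tranjugate(M):
    """Conjugate transpose of a square matrix."""
    n = len(M)
    return [[complex(M[i][j]).conjugate() for i in range(n)] for j in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def close(A, B, tol=1e-12):
    n = len(A)
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(n) for j in range(n))

def is_hermitian(M):
    return close(M, tranjugate(M))

def is_unitary(M):
    n = len(M)
    identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    return close(matmul(tranjugate(M), M), identity)

def is_normal(M):
    Ms = tranjugate(M)
    return close(matmul(Ms, M), matmul(M, Ms))

T = [[1, 1j], [-1j, 1]]     # Hermitian, not unitary
D1 = [[1j, 0], [0, 1]]      # unitary, not Hermitian
D2 = [[1j, 0], [0, 2]]      # normal, neither Hermitian nor unitary
```

Each predicate compares the matrix with its tranjugate exactly as in Definition 4.3.17, so the same functions can be reused for other small examples.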


Let $(V, \langle\ ,\ \rangle)$ be a real inner product space. A linear transformation $T$ from $V$ to
$V$ is called

(i) real symmetric (real skew symmetric) if $T^* = T$ ($T^* = -T$).
(ii) orthogonal if $T^* = T^{-1}$.


Example 4.3.20 The linear transformation $T_\theta$ from $\mathbb{R}^2$ to $\mathbb{R}^2$ defined by $T_\theta((x, y)) = (x\cos\theta + y\sin\theta, -x\sin\theta + y\cos\theta)$ is an orthogonal linear transformation (verify).


Let $H(V)$ ($SH(V)$) denote the set of all Hermitian (skew-Hermitian) linear
transformations on a complex inner product space $(V, \langle\ ,\ \rangle)$. If $T \in H(V) \cap SH(V)$, then
$T = T^* = -T$. This is equivalent to saying that $T = 0$. Thus $H(V) \cap SH(V) = \{0\}$. If $T_1$ and $T_2$ are in $H(V)$ ($SH(V)$), then $(T_1 + T_2)^* = T_1^* + T_2^* = T_1 + T_2$ (respectively $-(T_1 + T_2)$). This shows that $H(V)$ and $SH(V)$ are subgroups of $\operatorname{End}(V)$, and
their intersection is zero. Further, if $T$ is a nonzero Hermitian linear transformation,
and $\alpha$ a complex number, then $\alpha T$ is Hermitian (skew-Hermitian) if and only if $\alpha$ is
real (purely imaginary). Thus, $H(V)$ and $SH(V)$ are not subspaces. Also $T$ is Hermitian
if and only if $iT$ is skew-Hermitian. The map $T \mapsto iT$ defines an isomorphism
from the group $H(V)$ to $SH(V)$. Given $T_1, T_2 \in H(V)$, $T_1T_2 \in H(V)$ if and only
if $T_1T_2 = (T_1T_2)^* = T_2^*T_1^* = T_2T_1$. Similarly, given $T_1, T_2 \in SH(V)$, $T_1T_2 \in SH(V)$ if and only if $T_1T_2 = -T_2T_1$. Let $T$ be any endomorphism of $V$. Then
$\frac{T + T^*}{2}$ is Hermitian, for
\[ \Big(\frac{T + T^*}{2}\Big)^* = \frac{T^* + (T^*)^*}{2} = \frac{T + T^*}{2}. \]
Similarly, $\frac{T - T^*}{2}$ is skew-Hermitian, and $T = \frac{T + T^*}{2} + \frac{T - T^*}{2}$.


To summarize, we have proved the following proposition.

To summarize, we have proved the following proposition. 


Proposition 4.3.21 Let $(V, \langle\ ,\ \rangle)$ be a finite-dimensional complex inner product
space. Then the group $\operatorname{End} V$ is the direct sum of its subgroups $H(V)$ and $SH(V)$.
The subgroups $H(V)$ and $SH(V)$ are not subspaces. They are isomorphic as groups
under the map $T \mapsto iT$. $H(V)$ and $SH(V)$ are not closed under product. Indeed,
$T_1, T_2 \in H(V)$ implies that $T_1T_2 \in H(V)$ if and only if $T_1T_2 = T_2T_1$. Also $T_1, T_2 \in SH(V)$ implies that $T_1T_2 \in SH(V)$ if and only if $T_1T_2 = -T_2T_1$. ♯
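The direct sum decomposition can be made concrete on matrices: with respect to an orthonormal basis, the Hermitian part of $T$ is $(T + T^*)/2$ and the skew-Hermitian part is $(T - T^*)/2$. The following sketch computes both for an arbitrary sample matrix; the function names are illustrative.

```python
def tranjugate(M):
    """Conjugate transpose of a square matrix."""
    n = len(M)
    return [[complex(M[i][j]).conjugate() for i in range(n)] for j in range(n)]

def hermitian_part(M):
    """(M + M*) / 2, which is Hermitian."""
    Ms, n = tranjugate(M), len(M)
    return [[(M[i][j] + Ms[i][j]) / 2 for j in range(n)] for i in range(n)]

def skew_hermitian_part(M):
    """(M - M*) / 2, which is skew-Hermitian."""
    Ms, n = tranjugate(M), len(M)
    return [[(M[i][j] - Ms[i][j]) / 2 for j in range(n)] for i in range(n)]

M = [[1 + 2j, 3], [1j, 4 - 1j]]     # arbitrary sample endomorphism of C^2
H = hermitian_part(M)
S = skew_hermitian_part(M)
```

Adding `H` and `S` entrywise recovers `M`, and uniqueness of the decomposition follows from $H(V) \cap SH(V) = \{0\}$.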


The following two propositions follow immediately from the corresponding results
for matrices, provided we observe that the matrix representation map $M$ relative to an
orthonormal basis is an isomorphism from $\operatorname{End}(V)$ to $M_n(\mathbb{R})$ which maps $S(V)$ to
$S_n(\mathbb{R})$ and $SS(V)$ to $SS_n(\mathbb{R})$.


Proposition 4.3.22 Let $(V, \langle\ ,\ \rangle)$ be a real inner product space. Then the sets
$S(V)$ and $SS(V)$ of symmetric and skew symmetric linear transformations are
subspaces of $\operatorname{End}(V)$, and $\operatorname{End}(V)$ is the direct sum of these subspaces. Further, the product of
any two symmetric (skew-symmetric) linear transformations $A$ and $B$ is symmetric
(skew-symmetric) if and only if $AB = BA$ ($AB = -BA$). ♯


Proposition 4.3.23 Let $(V, \langle\ ,\ \rangle)$ be a real inner product space of dimension $n$. Then
the dimension of $S(V)$ is $\frac{n(n+1)}{2}$ and that of $SS(V)$ is $\frac{n(n-1)}{2}$. ♯




4.4 Isometries and Rigid Motions 


Definition 4.4.1 Let $(X, d)$ be a metric space. A bijective map $f$ from $X$ to itself is
called an isometry of $(X, d)$ if $d(f(x), f(y)) = d(x, y)$ for all $x, y \in X$.


The set of all isometries of $(X, d)$ is denoted by $Iso(X)$, and it is clearly a group
under composition of maps. This group is a subgroup of $Sym(X)$. If $(V, \langle\ ,\ \rangle)$ is an
inner product space, then it is already equipped with a metric induced by the inner
product. We shall try to describe the isometries of $V$ with respect to the induced
metric, and also its group $Iso(V)$ of isometries.


Theorem 4.4.2 Let $(V, \langle\ ,\ \rangle)$ be a finite dimensional complex inner product space.
Let $T$ be a linear transformation from $V$ to $V$. Then the following conditions are
equivalent.

1. $T$ is an isometry of $V$, i.e., $\|T(x) - T(y)\| = \|x - y\|$ for all $x, y \in V$.
2. $\|T(x)\| = \|x\|$ for all $x \in V$.
3. $T$ is a unitary linear transformation.
4. $T^* = T^{-1}$.

Proof $1 \Longrightarrow 2$. Assume 1. Then $\|T(x)\| = \|T(x) - T(0)\| = \|x - 0\| = \|x\|$ for all $x \in V$.

$2 \Longrightarrow 3$. Assume 2. We have the following polarization identity (Proposition
4.1.23)
\[ \langle x, y\rangle = \tfrac{1}{2}\big[\|x + y\|^2 + i\|x + iy\|^2 - (1 + i)(\|x\|^2 + \|y\|^2)\big] \]
for all $x, y \in V$. Thus,
\[ \langle T(x), T(y)\rangle = \tfrac{1}{2}\big[\|T(x) + T(y)\|^2 + i\|T(x) + iT(y)\|^2 - (1 + i)(\|T(x)\|^2 + \|T(y)\|^2)\big] \]
for all $x, y \in V$. Since $T$ is a linear transformation, we have
\[ \langle T(x), T(y)\rangle = \tfrac{1}{2}\big[\|T(x + y)\|^2 + i\|T(x + iy)\|^2 - (1 + i)(\|T(x)\|^2 + \|T(y)\|^2)\big] \]
for all $x, y \in V$. Using 2, we see that
\[ \langle T(x), T(y)\rangle = \tfrac{1}{2}\big[\|x + y\|^2 + i\|x + iy\|^2 - (1 + i)(\|x\|^2 + \|y\|^2)\big] = \langle x, y\rangle \]
for all $x, y \in V$. This shows that $T$ is a unitary linear transformation.

$3 \Longrightarrow 4$. Assume 3. Then
\[ \langle x, y\rangle = \langle T(x), T(y)\rangle = \langle x, T^*(T(y))\rangle \]
for all $x, y \in V$. Hence $\langle x, y - T^*T(y)\rangle = 0$ for all $x, y \in V$. Putting $x = y - T^*(T(y))$, we get that $T^*(T(y)) = y$ for all $y \in V$. This shows that $T^*T = I_V$.
Since $V$ is finite dimensional, $TT^* = I_V$. This means that $T^* = T^{-1}$.

$4 \Longrightarrow 1$. Assume 4. Then $T^*T = I_V$, and so $\langle T(x), T(y)\rangle = \langle x, T^*T(y)\rangle = \langle x, y\rangle$ for all $x, y \in V$. In particular, $\langle T(x), T(x)\rangle = \langle x, x\rangle$ for all $x \in V$. This means that $\|T(x)\| = \|x\|$ for all $x \in V$. Since
$T$ is a linear transformation, $\|T(x) - T(y)\| = \|T(x - y)\| = \|x - y\|$ for all
$x, y \in V$. ♯
Corollary 4.4.3 $GL(V) \cap Iso(V) = U(V)$. ♯
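The polarization identity used in the proof of Theorem 4.4.2 recovers the inner product from norms alone, which is why a norm preserving linear map must preserve the inner product. A minimal numerical sketch, assuming the standard inner product on $\mathbb{C}^n$ (linear in the first variable) and arbitrary sample vectors:

```python
def inner(u, v):
    """Standard inner product on C^n: linear in u, conjugate-linear in v."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def norm_sq(u):
    return inner(u, u).real

def polarization(x, y):
    """Right-hand side of the polarization identity, built from norms only."""
    x_plus_y = [a + b for a, b in zip(x, y)]
    x_plus_iy = [a + 1j * b for a, b in zip(x, y)]
    return 0.5 * (norm_sq(x_plus_y) + 1j * norm_sq(x_plus_iy)
                  - (1 + 1j) * (norm_sq(x) + norm_sq(y)))

x, y = [1 + 2j, 3 - 1j], [2j, 1 + 1j]
recovered = polarization(x, y)
direct = inner(x, y)
```

Both `recovered` and `direct` evaluate to the same complex number up to rounding.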


Proposition 4.4.4 Let $(V_1, \langle\ ,\ \rangle_1)$ and $(V_2, \langle\ ,\ \rangle_2)$ be complex inner product spaces.
Let $f$ be a vector space isomorphism from $V_1$ to $V_2$ which preserves inner product.
Then $f$ induces an isomorphism $\eta(f)$ from $U(V_1)$ to $U(V_2)$ defined by $\eta(f)(T) = f \circ T \circ f^{-1}$.


Proof It is clear that $\eta(f)$ defined above is an isomorphism from $GL(V_1)$ to
$GL(V_2)$. It is sufficient, therefore, to prove that if $f$ preserves inner product then
$\eta(f)$ takes $U(V_1)$ onto $U(V_2)$. Suppose that $f$ preserves inner product. Then
$\langle x, y\rangle = \langle f(f^{-1}(x)), f(f^{-1}(y))\rangle = \langle f^{-1}(x), f^{-1}(y)\rangle$. This shows that $f$
preserves inner product if and only if $f^{-1}$ preserves inner product. It is also immediate
that a composition of inner product preserving maps is inner product preserving.
Hence, if $f$ is inner product preserving, and $g$ an isomorphism, then $f \circ g$ ($g \circ f$) is
inner product preserving if and only if $g = f^{-1} \circ (f \circ g)$ is inner product preserving.
We know that $T \in U(V_1)$ if and only if $T$ is an isomorphism which is inner product
preserving. It follows that $T \in U(V_1)$ if and only if $f \circ T \circ f^{-1}$ is inner product
preserving. Thus, $\eta(f)$ induces an isomorphism from $U(V_1)$ to $U(V_2)$. ♯


Proposition 4.4.5 Any two $n$-dimensional complex inner product spaces are isomorphic
as inner product spaces, i.e., there is an isomorphism between them which
preserves inner product.


Proof Let $(V_1, \langle\ ,\ \rangle_1)$ and $(V_2, \langle\ ,\ \rangle_2)$ be two complex inner product spaces each of
dimension $n$. Let $\{x_1, x_2, \ldots, x_n\}$ be an orthonormal basis of $V_1$, and $\{y_1, y_2, \ldots, y_n\}$
be that of $V_2$. Then there is an isomorphism $f$ from $V_1$ to $V_2$ which takes $x_i$ to $y_i$.
But, then
\[ \Big\langle \sum_{i=1}^{n} \alpha_i x_i, \sum_{i=1}^{n} \beta_i x_i \Big\rangle_1 = \sum_{i=1}^{n} \alpha_i \overline{\beta_i} = \Big\langle \sum_{i=1}^{n} \alpha_i y_i, \sum_{i=1}^{n} \beta_i y_i \Big\rangle_2. \]
This shows that $f$ preserves inner product. ♯


Corollary 4.4.6 Every $n$-dimensional complex inner product space is isomorphic
as an inner product space to the standard complex inner product space $\mathbb{C}^n$. ♯


Thus, if $V$ is an $n$-dimensional complex inner product space, then the group $U(V)$
is isomorphic to $U(\mathbb{C}^n)$. The group $U(\mathbb{C}^n)$ is denoted by $U(n)$, and it is called the
unitary group on the $n$-dimensional inner product space.


Proposition 4.4.7 Let $(V, \langle\ ,\ \rangle)$ be a complex inner product space, and $H$ a linear
transformation from $V$ to itself. Then $H$ is a Hermitian linear transformation if and
only if $\langle H(x), x\rangle$ is real for all $x \in V$.


Proof Suppose that $H$ is Hermitian. Then
\[ \langle H(x), x\rangle = \langle x, H^*(x)\rangle = \langle x, H(x)\rangle = \overline{\langle H(x), x\rangle}. \]
Hence $\langle H(x), x\rangle$ is real for all $x \in V$. Conversely, suppose that
$\langle H(x), x\rangle$ is real for all $x \in V$. Then $\langle H(x + y), x + y\rangle$ is real for all
$x, y \in V$. Expanding, and noting that $\langle H(x), x\rangle$ and $\langle H(y), y\rangle$ are real, we
find that
\[ \langle H(x), y\rangle + \langle H(y), x\rangle \text{ is real for all } x, y \in V. \]
Again expanding $\langle H(x + iy), x + iy\rangle$, we get that
\[ \langle H(x), y\rangle - \langle H(y), x\rangle \text{ is purely imaginary for all } x, y \in V. \]
It is an elementary fact that if $z_1$ and $z_2$ are two complex numbers such that $z_1 + z_2$ is real and $z_1 - z_2$ is purely imaginary, then $z_1 = \overline{z_2}$. This shows that $\langle H(x), y\rangle = \overline{\langle H(y), x\rangle} = \langle x, H(y)\rangle$ for all $x, y \in V$. This shows that $H$ is
Hermitian. ♯


Corollary 4.4.8 Let $H$ be a Hermitian linear transformation from a finite dimensional
complex inner product space $(V, \langle\ ,\ \rangle)$ to itself. Then $I - iH$ and $I + iH$
are isomorphisms. Also $(I + iH)(I - iH)^{-1}$ is a unitary linear transformation.


Proof Suppose that $(I - iH)(x) = 0$. Then $x = iH(x)$, and so $\langle x, x\rangle = i\langle H(x), x\rangle$. Since $\langle x, x\rangle$ is real, and from the above proposition
$\langle H(x), x\rangle$ is also real, it follows that $\langle x, x\rangle = 0$, and so $x = 0$. This shows
that $I - iH$ is an injective linear transformation from $V$ to itself. Since $V$ is finite
dimensional, it follows that $I - iH$ is an isomorphism. Similarly, $I + iH$ is also an
isomorphism. Further, since $(T_1 \circ T_2)^* = T_2^* \circ T_1^*$, and $(T^{-1})^* = (T^*)^{-1}$, we have
\[ ((I + iH)(I - iH)^{-1})^* = ((I - iH)^{-1})^*(I + iH)^* = (I + iH)^{-1}(I - iH) = (I - iH)(I + iH)^{-1} = ((I + iH)(I - iH)^{-1})^{-1}. \]
(Note that $I - iH$ and $I + iH$ commute, and so $I - iH$ and $(I + iH)^{-1}$
also commute.) This shows that $(I + iH)(I - iH)^{-1}$ is unitary. ♯
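The corollary can be illustrated numerically for a $2 \times 2$ Hermitian matrix. The sketch below, which uses an arbitrary sample matrix and an explicit $2 \times 2$ inverse formula, forms $U = (I + iH)(I - iH)^{-1}$ and then $U^*U$; the helper names are illustrative.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def tranjugate(M):
    n = len(M)
    return [[complex(M[i][j]).conjugate() for i in range(n)] for j in range(n)]

def inverse_2x2(M):
    """Inverse of a 2x2 matrix via the adjugate formula (det assumed nonzero)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

H = [[2, 1 - 1j], [1 + 1j, -1]]                      # a sample Hermitian matrix
I2 = [[1, 0], [0, 1]]
plus = [[I2[i][j] + 1j * H[i][j] for j in range(2)] for i in range(2)]    # I + iH
minus = [[I2[i][j] - 1j * H[i][j] for j in range(2)] for i in range(2)]   # I - iH

U = matmul(plus, inverse_2x2(minus))
UstarU = matmul(tranjugate(U), U)                    # should be the identity
```

Invertibility of `minus` is exactly the first assertion of the corollary, so the division by the determinant is safe for any Hermitian `H`.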


Rigid Motion 


Definition 4.4.9 Let $(V, \langle\ ,\ \rangle)$ be a finite-dimensional real inner product space. Let
$d$ be the metric induced by the inner product. An isometry of $(V, d)$ is also called a
rigid motion on $V$. Thus, $T$ is a rigid motion if
\[ \|T(x) - T(y)\| = \|x - y\| \]
for all $x, y \in V$.


The group $Iso(V)$ of all rigid motions is called the group of rigid motions on $V$,
and it is denoted by $M(V)$.

As in the case of complex inner product spaces, an inner product preserving isomorphism
from a real inner product space $V_1$ to a real inner product space $V_2$ induces an
isomorphism from $M(V_1)$ to $M(V_2)$. Thus, the group of motions on an $n$-dimensional
real inner product space $V$ is isomorphic to $M(\mathbb{R}^n)$. This group is called the group
of Euclidean motions.


Theorem 4.4.10 Let $(V, \langle\ ,\ \rangle)$ be a finite dimensional real inner product space. Let
$T$ be a map from $V$ to $V$. Then the following conditions are equivalent.

1. $T$ is a rigid motion which fixes the origin 0.
2. $T$ preserves inner product, i.e., $\langle T(x), T(y)\rangle = \langle x, y\rangle$ for all $x, y \in V$.
This is equivalent to saying that $T$ preserves the angle between vectors.
3. $T$ is an orthogonal linear transformation.
4. $T$ is a linear transformation such that $T^* = T^{-1}$.
5. $T$ is a linear transformation which preserves lengths of vectors, i.e., $\|T(x)\| = \|x\|$ for all $x \in V$.


Proof $1 \Longrightarrow 2$. Assume 1. Then
\[ \|T(x)\| = \|T(x) - 0\| = \|T(x) - T(0)\| = \|x - 0\| = \|x\| \]
for all $x \in V$. Also, then
\[ \|x\|^2 + \|y\|^2 - 2\langle T(x), T(y)\rangle = \|T(x)\|^2 + \|T(y)\|^2 - 2\langle T(x), T(y)\rangle = \|T(x) - T(y)\|^2 = \|x - y\|^2 = \|x\|^2 + \|y\|^2 - 2\langle x, y\rangle \]
for all $x, y \in V$. Hence $\langle T(x), T(y)\rangle = \langle x, y\rangle$ for all $x, y \in V$.

$2 \Longrightarrow 3$. Assume 2. It is sufficient to prove that $T$ is a linear transformation. Using
the fact that $T$ preserves the inner product, we see that
\[ \langle T(x + y) - T(x) - T(y), T(x + y) - T(x) - T(y)\rangle = \langle x + y - x - y, x + y - x - y\rangle = 0 \]
for all $x, y \in V$. This shows that $T(x + y) = T(x) + T(y)$ for all $x, y \in V$. Similarly,
it can be shown that $T(\alpha x) = \alpha T(x)$ for all $\alpha \in \mathbb{R}$ and $x \in V$.

The proof of $3 \Longrightarrow 4$ is similar to the proof of $3 \Longrightarrow 4$ in Theorem 4.4.2.

$4 \Longrightarrow 5$. Assume 4. Then $T^* = T^{-1}$. Hence
\[ \|T(x)\|^2 = \langle T(x), T(x)\rangle = \langle x, T^*T(x)\rangle = \langle x, x\rangle = \|x\|^2 \]
for all $x \in V$.

$5 \Longrightarrow 1$. Assume 5. Since $T$ is a linear transformation, $T(0) = 0$ and
\[ \|T(x) - T(y)\| = \|T(x - y)\| = \|x - y\| \]
for all $x, y \in V$. ♯


Remark 4.4.11 The analogue of $1 \Longrightarrow 2$ is not valid in the complex case. For example,
the map $T$ from the usual complex inner product space $\mathbb{C}^n$ to itself defined by
\[ T((z_1, z_2, \ldots, z_n)) = (\overline{z_1}, \overline{z_2}, \ldots, \overline{z_n}) \]
preserves distance, fixes the origin, but does not preserve inner product.
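A small numerical check illustrates the remark, under the assumption that the map in question is componentwise complex conjugation on $\mathbb{C}^n$. The sample vectors and helper names below are illustrative.

```python
import math

def inner(u, v):
    """Standard inner product on C^n: linear in u, conjugate-linear in v."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def distance(u, v):
    diff = [a - b for a, b in zip(u, v)]
    return math.sqrt(inner(diff, diff).real)

def conjugate_map(z):
    """Componentwise complex conjugation (assumed form of the map T)."""
    return [complex(a).conjugate() for a in z]

z, w = [1 + 2j, 3 - 1j], [2j, 1 + 1j]
before = inner(z, w)                                   # <z, w>
after = inner(conjugate_map(z), conjugate_map(w))      # <T(z), T(w)>
d_before = distance(z, w)
d_after = distance(conjugate_map(z), conjugate_map(w))
```

The distances agree, while `after` is the complex conjugate of `before`, so the inner product is preserved only when it happens to be real.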


Remark 4.4.12 A distance preserving map from a real inner product space to itself
need not be an orthogonal transformation. Indeed, the translation map $L_a$ from the
usual inner product space $\mathbb{R}^n$ to itself defined by
\[ L_a(x) = x + a, \]
$a \ne 0$, preserves distance but it is not a linear transformation.


Let $(V, \langle\ ,\ \rangle)$ be a real inner product space and $a \in V$. The map $L_a$ from $V$ to $V$
defined by $L_a(x) = x + a$ is a rigid motion. This is called the translation by $a$. The
set $\mathfrak{T}(V)$ of all translations on $V$ is a subgroup of the group $M(V)$ of rigid motions
which is isomorphic to $(V, +)$ (the map $a \mapsto L_a$ is an isomorphism). Also $O(V)$, the
group of orthogonal linear transformations, is a subgroup of $M(V)$.


Theorem 4.4.13 Every rigid motion of a finite dimensional real inner product space
$(V, \langle\ ,\ \rangle)$ is uniquely expressible as a product of a translation and an orthogonal
linear transformation.


Proof Let $\phi \in M(V)$. Let $\phi(0) = a$. Then the map $T$ from $V$ to $V$ defined by
$T(x) = \phi(x) - a$ is also a rigid motion such that $T(0) = 0$. From Theorem 4.4.10,
$T$ is an orthogonal linear transformation and $\phi = L_a \circ T$. Further, suppose
that $L_a \circ T = L_b \circ T'$, where $T$ and $T'$ are orthogonal linear transformations. Then
$a = L_a \circ T(0) = L_b \circ T'(0) = b$. This shows that $a = b$ and $T = T'$. ♯
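The proof is constructive: the translation part is $a = \phi(0)$ and the orthogonal part is $x \mapsto \phi(x) - a$. The following sketch recovers both from a rigid motion of $\mathbb{R}^n$ given as a Python function; the function `decompose` and the sample motion `phi` (a quarter turn followed by a translation) are illustrative assumptions.

```python
def decompose(phi, n):
    """Split a rigid motion phi of R^n as L_a o T; return (a, matrix of T)."""
    a = phi([0.0] * n)                                   # a = phi(0)
    columns = []
    for j in range(n):
        e_j = [1.0 if k == j else 0.0 for k in range(n)]
        columns.append([p - q for p, q in zip(phi(e_j), a)])   # T(e_j) = phi(e_j) - a
    # columns[j] is the j-th column of the matrix of T; store it row-major
    matrix = [[columns[j][i] for j in range(n)] for i in range(n)]
    return a, matrix

def phi(v):
    """Sample rigid motion of R^2: rotate by 90 degrees, then translate by (2, 3)."""
    x, y = v
    return [-y + 2.0, x + 3.0]

a, Q = decompose(phi, 2)
```

For this sample motion the translation vector is $(2, 3)$ and the matrix `Q` is the rotation through a right angle, whose transpose is its inverse.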


Recall that a group $G$ is said to be the (internal) semi-direct product of a normal
subgroup $H$ by a subgroup $K$ if

(i) $G = HK$, and
(ii) $H \cap K = \{e\}$.


In this case every element $g$ of $G$ has a unique representation as $g = hk$, where
$k \in K$ and $h \in H$.


Corollary 4.4.14 $M(V)$ is the semi-direct product of the group $\mathfrak{T}(V)$ of translations by $O(V)$.


Proof It follows from the above result that $M(V) = \mathfrak{T}(V)O(V)$. Also if $L_a \in O(V)$, then $a = L_a(0) = 0$, and so $L_a = I_V$. Thus, $\mathfrak{T}(V) \cap O(V) = \{I_V\}$.
Now, we show that $\mathfrak{T}(V)$ is a normal subgroup of $M(V)$. Let $L_a \circ T \in M(V)$ and
$L_b \in \mathfrak{T}(V)$, where $T \in O(V)$. Then
\[ (L_a \circ T)^{-1} \circ L_b \circ (L_a \circ T)(x) = x + T^{-1}(b) = L_{T^{-1}(b)}(x). \]
Hence $(L_a \circ T)^{-1} \circ L_b \circ (L_a \circ T) = L_{T^{-1}(b)}$, which is a translation. ♯
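The normality computation can be verified on a concrete example. In the sketch below, a rigid motion $g = L_a \circ T$ of $\mathbb{R}^2$ is represented by an orthogonal matrix `Q` and a vector `a` (both arbitrary sample values), and the conjugate $g^{-1} \circ L_b \circ g$ is compared with translation by $T^{-1}(b) = Q^t b$.

```python
def apply(M, x):
    """Matrix-vector product M x."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def transpose(M):
    return [list(row) for row in zip(*M)]

Q = [[0, -1], [1, 0]]        # orthogonal matrix of T (rotation by 90 degrees)
a = [2, 3]                   # translation part of g = L_a o T
b = [5, 7]                   # the translation L_b being conjugated

def g(x):
    return [p + q for p, q in zip(apply(Q, x), a)]

def g_inverse(z):
    # For orthogonal Q, the inverse of x -> Qx + a is z -> Q^t (z - a)
    return apply(transpose(Q), [p - q for p, q in zip(z, a)])

def translate_b(x):
    return [p + q for p, q in zip(x, b)]

def conjugated(x):
    """(g^{-1} o L_b o g)(x)."""
    return g_inverse(translate_b(g(x)))

shift = apply(transpose(Q), b)   # T^{-1}(b)
```

For every input `x`, `conjugated(x)` equals `x` shifted by `shift`, so the conjugate of a translation is again a translation.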
The following corollary follows from the second isomorphism theorem.

Corollary 4.4.15 The quotient of $M(V)$ by its normal subgroup $\mathfrak{T}(V)$ of translations is isomorphic to $O(V)$. ♯

Exercises 


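4.4.1 Define a map $\langle\ ,\ \rangle'$ from $\mathbb{R}^3 \times \mathbb{R}^3$ to $\mathbb{R}$ by
\[ \langle (x_1, x_2, x_3), (y_1, y_2, y_3)\rangle' = x_1y_1 + x_2y_1 + x_1y_2 + 2x_2y_2 + 3x_3y_2 + 3x_2y_3 + x_3y_3. \]
Show that $\langle\ ,\ \rangle'$ is an inner product on $\mathbb{R}^3$. Deduce that
\[ | x_1y_1 + x_2y_1 + x_1y_2 + 2x_2y_2 + 3x_3y_2 + 3x_2y_3 + x_3y_3 | \le \sqrt{x_1^2 + 2x_1x_2 + 2x_2^2 + 6x_2x_3 + x_3^2}\,\sqrt{y_1^2 + 2y_1y_2 + 2y_2^2 + 6y_2y_3 + y_3^2} \]
for all real numbers $x_i, y_i$. Further, find an orthonormal basis of $(\mathbb{R}^3, \langle\ ,\ \rangle')$, and a linear
transformation $T$ from $\mathbb{R}^3$ to itself such that
\[ \langle T(x), T(y)\rangle' = \langle x, y\rangle \]
for all $x, y \in \mathbb{R}^3$, where $\langle\ ,\ \rangle$ is the standard Euclidean inner product.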


4.4.2 Define a map $\langle\ ,\ \rangle'$ from $\mathbb{R}^2 \times \mathbb{R}^2$ to $\mathbb{R}$ by


< (1,2), (Y1, 2) > = x1 + x1 yo + 2x21 + Yi y0. 


Is < >’ an inner product? Support. 


4.4.3 Let P3(x) denote the vector space of all polynomials of degree at most 3 with 
coefficients in the field R of real numbers. Let f(x) and g(x) be elements of P3(x). 
Define < f(x), g(x) >= i FS (x)g(x)dx. Show that <, > defines an inner product 
on P3(x). Find an orthonormal basis of P3(x). 


4.4.4 Let $V$ be the vector space of all real valued smooth functions on $\mathbb{R}$ and
\[ T = D^3 - 6D^2 + 11D - 6I, \]
where $D = \frac{d}{dx}$ is the standard differential operator. Show that $\ker T$ is a 3-dimensional
inner product space with respect to the inner product


1 
ag = I F(x)ga)dx. 




Find an orthonormal basis of kerT. 


4.4.5 Let $A$ be a non singular $3 \times 3$ matrix with real entries. Show, by means of an
example, that $\langle x, y\rangle = x A y^t$ need not define an inner product on $\mathbb{R}^3$. Show that
it defines an inner product if and only if $A = BB^t$ for some non singular matrix $B$.
Such a matrix is called a positive definite symmetric matrix.


4.4.6 Let $C[0, 1]$ denote the vector space of all continuous functions on the closed
interval $[0, 1]$. Show that $\langle f(x), g(x)\rangle = \int_0^1 f(x)g(x)\,dx$ defines an inner product
on $C[0, 1]$. Show that the set $\{1\} \cup \{\sqrt{2}\sin n\pi x, \sqrt{2}\cos n\pi x \mid n \in \mathbb{Z}\}$ forms an
orthonormal set.


4.4.7 Find the largest value and also the smallest value of $3x - 4y + 2z$, if they exist,
on the sphere $x^2 + y^2 + z^2 = 4$, and also on the ellipsoid $x^2 + 2y^2 + 3z^2 = 1$.
Do they exist on a paraboloid, or a hyperboloid?


4.4.8 Consider the standard real inner product space $\mathbb{R}^4$. Use the Gram-Schmidt
process to determine an orthonormal basis of the subspace $W$ generated by $\{(1, 1, 1, 1), (1, 2, 2, 2), (1, 2, 3, 3), (1, 0, 0, 0)\}$. What is the dimension of $W$?


4.4.9 Find a lower triangular matrix $P$ with positive entries, if possible, so that $PA$
is an orthogonal matrix, where
\[ A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 1 & 4 & 9 \end{bmatrix}. \]
Also express $A$ as $A = LO$ ($UO$), where $L$ is a lower ($U$ an upper) triangular matrix
with positive entries, and $O$ is an orthogonal matrix.


4.4.10 Let $W$ be the subspace of the standard Euclidean inner product space $\mathbb{R}^4$ generated
by $\{(1, -1, 1, -1), (1, 1, 1, -3), (1, 2, -3, 0)\}$. Find the distance of $(1, 1, 1, 1)$
from $W$, and also the foot of the perpendicular from $(1, 1, 1, 1)$ to $W$.


4.4.11 Consider the usual inner product space $\mathbb{R}^3$. Find the shortest distance between
the line 

Ke yr-2 2 

1 2 1’ 


and
\[ \frac{x - 1}{1} = \frac{y - 2}{1} = \frac{z - 3}{1}, \]


and also find the line of shortest distance, if it exists. 


4.4.12 In the standard inner product space $\mathbb{R}^4$, find the shortest distance between
{(x,y,z,w)|x+y+z+w=0=x-—y=z-w} and {(x,y,z,w)|x+z4+ 
1 = O}. 




4.4.13 Consider the subspace W = {(x, y,z,w)|x+2y+z+w=0=x+4+ 
y — 2z = w} of the standard inner product space R*. Find an orthonormal basis 
of $W$, and also of $W^{\perp}$. Find also the components of $(1, 1, 1, 1)$ along $W$ and
perpendicular to $W$.


4.4.14 Let $T$ be the linear transformation from the standard inner product space $\mathbb{R}^3$
to itself defined by
\[ T((x, y, z)) = (x + y, y + z, z + x). \]
Find $T^*$, and also find its matrix representation with respect to the standard basis.


4.4.15 Let $W$ be a subspace of a finite-dimensional inner product space $V$, and
$x \in V$ be such that $\langle x, y\rangle + \langle y, x\rangle \le \langle y, y\rangle$ for all $y \in W$. Show that
$x \in W^{\perp}$.


4.4.16 Let T be a normal linear transformation from V to V such that T(x) = 0. 
Show that T*(x) = 0. 


4.4.17 Let $A$ be a bounded subset of a finite-dimensional real inner product space $V$
in the sense that there is a real number $M$ such that $\|x\| \le M$ for all $x \in A$. Show
that $\{f \in M(V) \mid f(A) \subseteq A\}$ is a subgroup of $O(V)$.


4.4.18 Let T € End(V), where V is a complex inner product space. Suppose that 
T*T(x) = 0. Show that T(x) = 0. 


4.4.19 Let $T$ be a Hermitian linear transformation on a finite dimensional complex
inner product space $V$. Let $x \in V$ be such that $T^m(x) = 0$ for some $m \ge 1$. Show
that $T(x) = 0$.


4.4.20 Let $V$ be a real inner product space of dimension $n$, and $\{x_1, x_2, \ldots, x_n\}$ an
orthonormal basis. Define a map $\eta$ from $S_n$ to $O(V)$ as follows. Let $p \in S_n$. Then
$\eta(p)$ is the unique orthogonal linear transformation whose effect on the orthonormal
basis is given by $\eta(p)(x_i) = x_{p(i)}$. Show that $\eta$ is an injective homomorphism.
Deduce that every group of order $n$ is isomorphic to a subgroup of $O(V)$.


4.4.21 Show that every group of order $n$ is isomorphic to a subgroup of $U(n)$.
4.4.22 Show that every finite subgroup of M(V) is a subgroup of O(V). 
4.4.23 Find all finite subgroups of O(2). 

4.4.24 Show that there is no proper open subspace of an inner product space. 


4.4.25 Show that every nonempty open subset of an inner product space generates 
the space. 


4.4.26 Show that the unit sphere $\{x \mid \|x\| = 1\}$ generates the space.




4.4.27 Show that every subspace of an inner product space is closed. 


4.4.28 Show that every linear transformation from a finite dimensional inner product 
space to an inner product space is continuous. 


4.4.29 Show that inner product map is also continuous. 


4.4.30 Let $W$ be a subspace of a finite dimensional inner product space $V$. Show
that $V/W$ is an inner product space with respect to the inner product defined by
$\langle x + W, y + W\rangle = \operatorname{g.l.b.}\{\langle x + u, y + v\rangle \mid u, v \in W\}$.


4.4.31 Let $A$ be a skew-Hermitian transformation. Show that $I + A$ and $I - A$
are isomorphisms.


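4.4.32 Let $V$ be a finite dimensional real inner product space of dimension $n$. Define
an inner product $\langle\ ,\ \rangle$ on $\operatorname{End}(V)$ by
\[ \langle T, T'\rangle = \sum_{i,j} \alpha_{ij}\beta_{ij}, \]
where $T = \sum_{i,j} \alpha_{ij}T_{ij}$ and $T' = \sum_{i,j} \beta_{ij}T_{ij}$. Find the distance between the subspaces
$S(V)$ and $SS(V)$.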


4.4.33 Check if the following transformations from $\mathbb{R}^3$ to itself are orthogonal
transformations.


(i) $T(x_1, x_2, x_3) = (x_2, x_3, x_1)$
Gi) T(x, x2,%3) = (Ae, “5, x3) 
Gii) TQx1,22,43) = (52, Sees, aoe) 


(iv) $T(x_1, x_2, x_3) = (x_1 + x_2, x_3, x_1)$.
4.4.34 Check if the following transformations from $\mathbb{R}^3$ to itself are rigid motions.


(i) $T(x_1, x_2, x_3) = (x_2 + 1, x_3 + 2, x_1)$


(ii) T (x, X2, X3) = ore We 3) 
ses xp +x2 i X30 Xy—X2—2: 
Gil) TQ, %2,%3) = e+, as, 7 ae? 


(iv) $T(x_1, x_2, x_3) = (x_1 + x_2, x_3, x_1)$.


4.4.35 Let $V$ be a real inner product space of dimension $n$ and $x \in V$, $x \neq 0$.
Show that $P_x = \{y \in V \mid \langle x, y \rangle = 0\}$ is a hyperplane in $V$. Show that $P_x = P_y$
if and only if $x = ay$ for some $a \neq 0$. Further, show that every hyperplane is of the
form $P_x$ for some $x$. Determine a bijective correspondence between the set of lines
and the set of hyperplanes.


4.4.36 Let $P_x$ be the hyperplane determined by an element $x$ in a real inner product
space $V$ as described in the above exercise. Let $\sigma_x$ be a linear transformation on $V$
which fixes the elements of $P_x$ and maps a vector orthogonal to $P_x$ to its negative.
Show that $\sigma_x$ is uniquely defined, and it is given by

\[ \sigma_x(y) = y - \frac{2\langle y, x \rangle}{\langle x, x \rangle}\, x. \]

This linear transformation is called the reflection about the hyperplane $P_x$. Observe
that $P_x = P_y$ if and only if $\sigma_x = \sigma_y$. Deduce that every reflection is an orthogonal
linear transformation whose square is the identity. Further, find the matrix representation
of $\sigma_x$ with respect to the standard basis.


4.4.37 Let $\Phi$ be a finite set of generators of a real inner product space $V$ such that
$0 \notin \Phi$. Let $\sigma \in GL(V)$ be such that

(i) $\sigma(\Phi) = \Phi$
(ii) $\sigma$ fixes element wise a hyperplane $P$
(iii) $\sigma(\alpha) = -\alpha$ for some $\alpha \in \Phi$.

Show that $\sigma = \sigma_\alpha$ and $P = P_\alpha$.


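4.4.38 Let $\Phi$ be as in the above exercise such that

(i) $\alpha \in \Phi$ implies that $-\alpha \in \Phi$
(ii) $\sigma_\alpha(\Phi) = \Phi$ for all $\alpha \in \Phi$
(iii) $\frac{2\langle \beta, \alpha \rangle}{\langle \alpha, \alpha \rangle} \in \mathbb{Z}$ for all $\alpha, \beta \in \Phi$.

Show that $\Phi \cap \langle \alpha \rangle \subseteq \{\frac{1}{2}\alpha, -\frac{1}{2}\alpha, \alpha, -\alpha, 2\alpha, -2\alpha\}$ for all $\alpha \in \Phi$.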


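4.4.39 In the above exercise, if in addition, $\Phi \cap \langle \alpha \rangle = \{\alpha, -\alpha\}$, then $\Phi$ is called a
root system. Let $\Phi$ be a root system and $\alpha, \beta \in \Phi$. Show that the angle between $\alpha$
and $\beta$ is one of the following: $0, \frac{\pi}{6}, \frac{\pi}{4}, \frac{\pi}{3}, \frac{\pi}{2}, \frac{2\pi}{3}, \frac{3\pi}{4}, \frac{5\pi}{6}, \pi$. Determine the ratio of
their lengths in each case.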


4.4.40 Determine all roots in $\mathbb{R}^2$.


Chapter 5 
Determinants and Forms 


In this chapter, we introduce the concept of determinant in various ways. The invariant
subspaces, the eigenvalues, the spectral theorem, the geometry of orthogonal
transformations, and the geometry of bilinear forms also constitute the subject matter
of this chapter.


5.1 Determinant of a Matrix 


We define the determinant $\det(A)$ of an $n \times n$ matrix $A$ by induction on $n$ as follows:
Let $A = [a_{ij}]$ be an $n \times n$ matrix. Let $A_{ij}$ denote the $(n-1) \times (n-1)$ submatrix
obtained by deleting the $i$th row and the $j$th column of the matrix.


Example 5.1.1 For the matrix 


Definition 5.1.2 If $A = [a_{11}]$ is a $1 \times 1$ matrix, then define $\det(A) = a_{11}$. Assuming
that the determinant of all $m \times m$ matrices, $m < n$, has already been defined,
define the determinant of an $n \times n$ matrix $A$ by


© Springer Nature Singapore Pte Ltd. 2017
R. Lal, Algebra 2, Infosys Science Foundation Series in Mathematical Sciences,
DOI 10.1007/978-981-10-4256-0_5


\[ \det(A) = \sum_{i=1}^{n} (-1)^{i+1} a_{i1} \det(A_{i1}). \]


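Thus,

\[ \det\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = (-1)^{1+1}a_{11}\det A_{11} + (-1)^{2+1}a_{21}\det A_{21} = a_{11}a_{22} - a_{21}a_{12}, \]

and

\[ \det\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} = (-1)^{1+1}a_{11}\det\begin{bmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{bmatrix} + (-1)^{2+1}a_{21}\det\begin{bmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{bmatrix} + (-1)^{3+1}a_{31}\det\begin{bmatrix} a_{12} & a_{13} \\ a_{22} & a_{23} \end{bmatrix} \]

\[ = a_{11}(a_{22}a_{33} - a_{32}a_{23}) - a_{21}(a_{12}a_{33} - a_{32}a_{13}) + a_{31}(a_{12}a_{23} - a_{22}a_{13}). \]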
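The inductive definition translates directly into a recursive procedure. The following is a minimal Python sketch, assuming a matrix is stored as a list of rows; the helper names `minor` and `det` are illustrative and not from the text. It expands along the first column exactly as in Definition 5.1.2, so it is meant for checking small cases by hand rather than for efficiency.

```python
def minor(a, i, j):
    """Return the submatrix A_ij: delete row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]


def det(a):
    """Determinant by expansion along the first column (Definition 5.1.2)."""
    n = len(a)
    if n == 1:
        return a[0][0]
    # With 0-based row index i, the sign (-1)^(i+1) of the text becomes (-1)^i.
    return sum((-1) ** i * a[i][0] * det(minor(a, i, 0)) for i in range(n))


if __name__ == "__main__":
    print(det([[1, 2], [3, 4]]))                    # a11*a22 - a21*a12
    print(det([[1, 1, 1], [0, 1, 1], [0, 0, 2]]))
```

The recursion performs on the order of $n!$ multiplications, which is why the row-operation properties listed later in the chapter are preferred for actual computation.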


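Theorem 5.1.3 The determinant map det satisfies the following properties:
(i) det is linear in each row of the matrix. More precisely,

\[ \det\begin{bmatrix} r_1 \\ \vdots \\ a r_i + b r_i' \\ \vdots \\ r_n \end{bmatrix} = a\det\begin{bmatrix} r_1 \\ \vdots \\ r_i \\ \vdots \\ r_n \end{bmatrix} + b\det\begin{bmatrix} r_1 \\ \vdots \\ r_i' \\ \vdots \\ r_n \end{bmatrix} \]

for each $i$.
(ii) If two distinct rows of a matrix $A$ are the same, then $\det(A) = 0$.
(iii) $\det(I_n) = 1$.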


Proof (i) We prove (i) by induction on $n$. For $n = 1$, $\det([a a_{11}]) = a a_{11} =
a\det([a_{11}])$, and there is nothing to do. Assume that the result is true for matrices
of order less than $n$. Let $A$ be a matrix of order $n$. Consider a general term
$(-1)^{i+1}a_{i1}\det(A_{i1})$ under summation in the definition of $\det(A)$. If $k \neq i$, then $a_{i1}$
does not depend on the $k$th row, whereas by the induction hypothesis, $\det(A_{i1})$ depends
linearly on the $k$th row. However, $a_{k1}$ depends linearly on the $k$th row, whereas $\det(A_{k1})$ is
independent of the $k$th row. This shows that det is linear in each row.

(ii) Suppose that the $k$th row $r_k$ is the same as the $l$th row $r_l$ of $A$. Suppose that $k < l$. If
$i \neq k$ and $i \neq l$, then two rows of $A_{i1}$ will be the same, and so by the induction hypothesis
$\det(A_{i1}) = 0$. Thus, all terms under summation on the R.H.S. of the expression for
$\det(A)$ are zero except perhaps $(-1)^{k+1}a_{k1}\det(A_{k1})$ and $(-1)^{l+1}a_{l1}\det(A_{l1})$. Hence
$\det(A)$ equals

\[ (-1)^{k+1}a_{k1}\det(A_{k1}) + (-1)^{l+1}a_{l1}\det(A_{l1}). \]

Observe that $a_{k1} = a_{l1}$. Next, $A_{k1}$ can be obtained from $A_{l1}$ by $l - k - 1$ interchanges
of consecutive rows. This means that $\det(A_{k1}) = (-1)^{l-k-1}\det(A_{l1})$. Substituting
it, we see that $\det(A) = 0$.

(iii) Finally, if $A = I_n$, then the only term which appears in the expression for
$\det(I_n)$ is $(-1)^{1+1}\det(I_{n-1}) = 1$. ♯


Remark 5.1.4 Imitating the proof of the above theorem, one can easily observe that
the associations $A \mapsto \sum_{i=1}^{n}(-1)^{i+j}a_{ij}\det(A_{ij})$ (for a fixed $j$) and
$A \mapsto \sum_{j=1}^{n}(-1)^{i+j}a_{ij}\det(A_{ij})$ (for a fixed $i$) from the
set of square matrices to the field $F$ also satisfy the three properties listed in the
theorem. We shall show (see Corollary 5.3.16) that an association from $M_n(F)$ to $F$
satisfying the three properties listed in the above theorem is unique. As such, it will
follow that $\det(A) = \sum_{i=1}^{n}(-1)^{i+j}a_{ij}\det(A_{ij}) = \sum_{j=1}^{n}(-1)^{i+j}a_{ij}\det(A_{ij})$. Thus, we
can expand the determinant along any row, or along any column. In turn, $\det(A) =
\det(A^t)$.


Corollary 5.1.5 $\sum_{j=1}^{n}(-1)^{k+j}a_{ij}\det(A_{kj}) = 0$ for all $i \neq k$.


Proof Suppose that $i \neq k$. Consider the matrix $B = [b_{pq}]$, where $b_{pq} = a_{pq}$ if $p \neq k$
and $b_{kq} = a_{iq}$. In other words, $B$ is obtained by replacing the $k$th row of $A$ by the $i$th
row. Thus, deleting the $k$th row and $j$th column of $A$ is the same as doing the same thing
on $B$. More precisely, $A_{kj} = B_{kj}$. The expression $\sum_{j=1}^{n}(-1)^{k+j}a_{ij}\det(A_{kj})$ becomes
$\sum_{j=1}^{n}(-1)^{k+j}b_{kj}\det(B_{kj})$. Since two distinct rows of $B$ are the same, it follows, from the
above remark, that $\sum_{j=1}^{n}(-1)^{k+j}b_{kj}\det(B_{kj}) = 0$. The result follows. ♯

Definition 5.1.6 Let $A$ be a square matrix of order $n$. Then $(-1)^{i+j}\det(A_{ij})$ is called
the $(i, j)$ co-factor of $A$. The matrix $A^{cof} = [(-1)^{i+j}\det(A_{ij})]$ is called the co-factor
matrix of $A$. The transpose $(A^{cof})^t$ of the co-factor matrix is called the adjoint of $A$,
and it is denoted by $A^{adj}$. Thus, $A^{adj} = [b_{ij}]$, where $b_{ij} = (-1)^{i+j}\det(A_{ji})$.


Corollary 5.1.7 $A \cdot A^{adj} = \det(A)I_n = A^{adj} \cdot A$.


Proof Suppose that $A \cdot A^{adj} = [c_{ij}]$. Then

\[ c_{ij} = \sum_{k=1}^{n} a_{ik}b_{kj} = \sum_{k=1}^{n} a_{ik}(-1)^{j+k}\det(A_{jk}). \]

From Remark 5.1.4 and Corollary 5.1.5, it follows that $c_{ij} = 0$ if $i \neq j$, and it is $\det A$ if $i = j$.
This proves that $A \cdot A^{adj} = \det(A)I_n$. It also follows that $A^t \cdot (A^t)^{adj} = \det(A^t)I_n =
\det(A)I_n$. Further, it is also clear that $(A^{adj})^t = (A^t)^{adj}$. Hence
$(A^{adj}A)^t = A^t(A^{adj})^t = A^t(A^t)^{adj} =
\det(A^t)I_n = \det(A)I_n$. Taking the transpose, this shows that $A^{adj} \cdot A = \det(A)I_n$. ♯


Remark 5.1.8 We shall see a little later that $\det(AB) = \det A \det B$. Since $A \cdot A^{adj} =
\det(A)I_n$, it follows that if $\det(A) \neq 0$, then $\det(A^{adj}) = (\det(A))^{n-1}$.




Corollary 5.1.9 $A$ is invertible if and only if $\det(A) \neq 0$, and then $A^{-1} =
(\det(A))^{-1}A^{adj}$.


Proof Suppose that $A$ is invertible, and $A^{-1}A = I_n$. Then $1 = \det(A^{-1} \cdot A) =
\det A^{-1} \cdot \det A$. Hence $\det A \neq 0$. Conversely, if $\det(A) \neq 0$, then since $A \cdot A^{adj} =
\det(A)I_n$, we have $A \cdot (\det(A))^{-1}A^{adj} = I_n$. ♯


The above result gives us another method to find the inverse of a matrix. This we 
illustrate by means of the following example. 


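Example 5.1.10 Consider the matrix $A$ given by

\[ A = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 2 \end{bmatrix}. \]

Here
$(-1)^{1+1}\det(A_{11}) = 2$, $(-1)^{1+2}\det(A_{12}) = 0 = (-1)^{1+3}\det(A_{13})$,
$(-1)^{2+1}\det(A_{21}) = -2$, $(-1)^{2+2}\det(A_{22}) = 2$,
$(-1)^{2+3}\det(A_{23}) = 0 = (-1)^{3+1}\det(A_{31})$, $(-1)^{3+2}\det(A_{32}) = -1$
and $(-1)^{3+3}\det(A_{33}) = 1$.

Thus, the co-factor matrix is

\[ \begin{bmatrix} 2 & 0 & 0 \\ -2 & 2 & 0 \\ 0 & -1 & 1 \end{bmatrix}. \]

The adjoint $A^{adj}$, being the transpose of the co-factor matrix, is

\[ \begin{bmatrix} 2 & -2 & 0 \\ 0 & 2 & -1 \\ 0 & 0 & 1 \end{bmatrix}. \]

Clearly, $\det(A) = 2$, and so the inverse of $A$ is

\[ \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & -\frac{1}{2} \\ 0 & 0 & \frac{1}{2} \end{bmatrix}. \]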
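The computation in this example can be reproduced mechanically. The sketch below, in Python with exact rational arithmetic, forms the adjoint entry by entry from co-factors and divides by the determinant, as in Corollary 5.1.9. The function names are illustrative, and integer entries are assumed so that `Fraction` can be applied directly.

```python
from fractions import Fraction


def minor(a, i, j):
    """Submatrix obtained by deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]


def det(a):
    """First-column co-factor expansion; the empty matrix has determinant 1."""
    if not a:
        return 1
    return sum((-1) ** i * a[i][0] * det(minor(a, i, 0)) for i in range(len(a)))


def adjoint(a):
    """Transpose of the co-factor matrix: entry (i, j) is the (j, i) co-factor."""
    n = len(a)
    return [[(-1) ** (i + j) * det(minor(a, j, i)) for j in range(n)]
            for i in range(n)]


def inverse(a):
    """Inverse as det(A)^(-1) times the adjoint, for integer matrices."""
    d = det(a)
    if d == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(x, d) for x in row] for row in adjoint(a)]


if __name__ == "__main__":
    A = [[1, 1, 1], [0, 1, 1], [0, 0, 2]]
    print(adjoint(A))
    print(inverse(A))
```

This route needs $n^2$ determinants of order $n - 1$, so it is mainly of theoretical interest; row reduction is cheaper for larger matrices.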


Corollary 5.1.11 (Cramer's rule) Consider a system of $n$ linear equations in $n$
unknowns given by the matrix equation

\[ A \cdot X = B, \]

where $A$ is an invertible matrix. Then

\[ x_i = \frac{\sum_{j=1}^{n}(-1)^{i+j}b_j\det(A_{ji})}{\det(A)} \quad \text{for all } i. \]


Proof Since $A^{-1} = (\det(A))^{-1}A^{adj}$, it follows that $X = \frac{1}{\det(A)}A^{adj} \cdot B$. The result
follows if we equate the rows. ♯
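As an illustration of the formula, here is a small Python sketch that evaluates each unknown as the co-factor sum divided by $\det(A)$. The names `cramer`, `det` and `minor` are illustrative; integer input is assumed, and the result is returned as exact fractions.

```python
from fractions import Fraction


def minor(a, i, j):
    """Submatrix obtained by deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]


def det(a):
    """First-column co-factor expansion; the empty matrix has determinant 1."""
    if not a:
        return 1
    return sum((-1) ** i * a[i][0] * det(minor(a, i, 0)) for i in range(len(a)))


def cramer(a, b):
    """Solve A X = B via x_i = sum_j (-1)^(i+j) b_j det(A_ji) / det(A)."""
    n = len(a)
    d = det(a)
    if d == 0:
        raise ValueError("Cramer's rule needs an invertible matrix")
    return [Fraction(sum((-1) ** (i + j) * b[j] * det(minor(a, j, i))
                         for j in range(n)), d)
            for i in range(n)]


if __name__ == "__main__":
    print(cramer([[2, 1], [1, 3]], [3, 5]))
```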


5.2 Permutations 


This is a brief section on permutations with the aim of introducing even and odd
permutations. For details, one may refer to Algebra 1. A bijective map $p$ from $\{1, 2, \ldots, n\}$
to itself is called a permutation on $\{1, 2, \ldots, n\}$. The set $S_n$ of all permutations on
$\{1, 2, \ldots, n\}$ is a group with respect to the composition of maps (product of
permutations). We may represent an element $p \in S_n$ (without any ambiguity) by

\[ \begin{pmatrix} 1 & 2 & \cdots & n \\ p(1) & p(2) & \cdots & p(n) \end{pmatrix}. \]

Since $p$ is a bijective map, the second row is just a rearrangement (permutation)
of $1, 2, \ldots, n$. Thus, any $p \in S_n$ gives a unique permutation described above.
Conversely, if we have a rearrangement of $1, 2, \ldots, n$, then it gives rise to a unique
bijective map from $\{1, 2, \ldots, n\}$ to itself by putting the rearrangement below $1\,2 \cdots n$
as above. For example, if $n = 4$, the rearrangement 2314 of 1234 gives rise to a
bijective map $p$ from $\{1, 2, 3, 4\}$ to itself given by $p(1) = 2$, $p(2) = 3$, $p(3) = 1$
and $p(4) = 4$. In the above notation

\[ p = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 1 & 4 \end{pmatrix}. \]

Thus, the members of $S_n$ can be viewed as permutations. The product $gf$ of
permutations

\[ f = \begin{pmatrix} 1 & 2 & \cdots & n \\ f(1) & f(2) & \cdots & f(n) \end{pmatrix} \]

and

\[ g = \begin{pmatrix} 1 & 2 & \cdots & n \\ g(1) & g(2) & \cdots & g(n) \end{pmatrix} \]

is given by

\[ gf = \begin{pmatrix} 1 & 2 & \cdots & n \\ g(f(1)) & g(f(2)) & \cdots & g(f(n)) \end{pmatrix}. \]


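Example 5.2.1 If

\[ p = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 1 & 2 \end{pmatrix} \]

and

\[ q = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 3 & 4 \end{pmatrix}, \]

then

\[ pq = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 3 & 1 & 2 \end{pmatrix} \]

and

\[ qp = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 2 & 1 \end{pmatrix}. \]

Thus, $pq \neq qp$.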
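The product rule $gf(i) = g(f(i))$ is easy to check by machine. The sketch below assumes a permutation is stored as the tuple of its second row, and it takes $p$ to be the permutation with second row 3 4 1 2, which is the one consistent with the products $pq$ and $qp$ displayed in the example. The function name `compose` is illustrative.

```python
def compose(g, f):
    """Return gf, the map i -> g(f(i)); a permutation is the tuple of images of 1..n."""
    return tuple(g[f[i] - 1] for i in range(len(f)))


if __name__ == "__main__":
    p = (3, 4, 1, 2)   # assumed second row of p
    q = (2, 1, 3, 4)
    print(compose(p, q))   # second row of pq
    print(compose(q, p))   # second row of qp
```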


Cycles and Transpositions 


Now, we consider special types of permutations. Consider, for example, the
permutation

\[ p = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 2 & 5 & 3 & 1 & 6 & 4 \end{pmatrix}. \]

$p$ takes 1 to 2, 2 to 5, 5 to 6, 6 to 4 and 4 to 1. The remaining symbol 3 is fixed. We can
faithfully represent the permutation $p$ by the row (1 2 5 6 4) with the understanding
that each symbol goes to the following symbol, the last symbol is mapped to the first
symbol, and the symbol not appearing in the row is kept fixed. Thus, the permutation

\[ \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 1 & 5 & 2 & 4 & 7 & 6 & 3 \end{pmatrix} \]

can be represented by (2 5 7 3), whereas

\[ \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 4 & 3 \end{pmatrix} \]

cannot be represented in this form.


Definition 5.2.2 A permutation $p \in S_n$ is called a cycle of length $r > 1$ if there
exists a subset $\{i_1, i_2, \ldots, i_r\}$ of $\{1, 2, \ldots, n\}$ containing $r$ distinct elements such
that $p(i_1) = i_2$, $p(i_2) = i_3, \ldots, p(i_{r-1}) = i_r$, $p(i_r) = i_1$, and $p(j) = j$ for all
$j \notin \{i_1, i_2, \ldots, i_r\}$. The cycle $p$ is denoted by $(i_1 i_2 \cdots i_r)$. A cycle of length 2 is called
a transposition. Thus, a transposition is represented by $(i\ j)$ which interchanges $i$
and $j$, and keeps the rest of the symbols fixed.


Theorem 5.2.3. Every nonidentity permutation can be written as product of disjoint 
cycles. Further, any two representations of a nonidentity permutation as product of 
disjoint cycles are same up to a rearrangement of the cycles. 




Proof Let $p$ be a nonidentity permutation in $S_n$. Then there exists $i_1$ such that
$p(i_1) \neq i_1$. Clearly, all members of the set $\{i_1, p(i_1), p^2(i_1), \ldots, p^n(i_1)\}$ cannot be
distinct. Hence, there exist $r, s$; $0 \leq r < s \leq n$ such that $p^r(i_1) = p^s(i_1)$. Thus,
there exists $t$, $1 \leq t \leq n$ such that $p^t(i_1) = i_1$. Let $l_1$ be the least positive integer
such that $p^{l_1}(i_1) = i_1$. Given $m \in \mathbb{Z}$, by the division algorithm, there exist $q, r$
such that $m = l_1 q + r$, where $0 \leq r \leq l_1 - 1$. But, then $p^m(i_1) = p^r(i_1)$.
It is clear from the above observation that the effect of the permutation $p$ on
the symbols in $\{i_1, p(i_1), p^2(i_1), \ldots, p^{l_1-1}(i_1)\}$ is the same as that of the cycle
$C_1 = (i_1\ p(i_1)\ p^2(i_1) \cdots p^{l_1-1}(i_1))$. If $p = C_1$, then there is nothing to do. If
not, there exists $i_2 \notin \{i_1, p(i_1), p^2(i_1), \ldots, p^{l_1-1}(i_1)\}$ such that $p(i_2) \neq i_2$. As before,
consider the cycle $C_2 = (i_2\ p(i_2)\ p^2(i_2) \cdots p^{l_2-1}(i_2))$, where $l_2$ is the smallest
positive integer such that $p^{l_2}(i_2) = i_2$. Clearly, $C_1$ and $C_2$ are disjoint cycles. If
$p = C_1C_2$, then there is nothing to do. If not, proceed. This process stops after
finitely many steps giving $p$ as a product of disjoint cycles, because the symbols are
finitely many.
Finally, we prove the uniqueness. Suppose that $p \neq I$ and

\[ p = C_1C_2 \cdots C_r = C_1'C_2' \cdots C_s', \]

where $C_i$ and $C_j$ are disjoint for $i \neq j$, and also $C_k'$ and $C_l'$ are disjoint for $k \neq l$.
Suppose that $p(t) \neq t$. Then there exist $i, k$ such that $C_i(t) \neq t$, and also $C_k'(t) \neq t$.
We may assume that $C_1(t) \neq t$ and $C_1'(t) \neq t$. But, then using the arguments of the
previous paragraph, we find that $C_1 = C_1'$. Canceling $C_1$ and $C_1'$, using induction,
and the fact that products of nonidentity disjoint cycles can never be the identity, we find
that $r = s$, and, after a rearrangement, $C_i = C_i'$ for all $i$. ♯


Remark 5.2.4 The proof of the above theorem is algorithmic, and it gives an algo- 
rithm to express a permutation as product of disjoint cycles. 
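The algorithm implicit in the proof can be sketched as follows, in Python, assuming a permutation is given as the tuple of its second row. Starting from the smallest symbol that is moved and not yet visited, we follow $i, p(i), p^2(i), \ldots$ until we return to $i$. The function name `disjoint_cycles` is illustrative.

```python
def disjoint_cycles(p):
    """Return the nontrivial disjoint cycles of p, where p[i-1] is the image of i."""
    seen = set()
    cycles = []
    for start in range(1, len(p) + 1):
        if start in seen or p[start - 1] == start:
            continue  # already placed in a cycle, or a fixed symbol
        cycle = [start]
        seen.add(start)
        nxt = p[start - 1]
        while nxt != start:  # follow start, p(start), p^2(start), ...
            cycle.append(nxt)
            seen.add(nxt)
            nxt = p[nxt - 1]
        cycles.append(tuple(cycle))
    return cycles


if __name__ == "__main__":
    print(disjoint_cycles((2, 5, 3, 1, 6, 4)))
    print(disjoint_cycles((2, 1, 4, 3)))
```

Each symbol is visited once, so the decomposition takes a number of steps proportional to $n$.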


Proposition 5.2.5 Every cycle is a product of transpositions.
Proof $(i_1 i_2 \cdots i_r) = (i_1 i_r)(i_1 i_{r-1}) \cdots (i_1 i_2)$. ♯


Since every permutation is product of disjoint cycles, we have the following 
corollary. 


Corollary 5.2.6 Every permutation is product of transpositions. tt 


Remark 5.2.7 Representation of a permutation as product of transpositions is not 
unique. For example, 


\[ (1234) = (14)(13)(12) = (14)(13)(12)(24)(24) = (14)(23)(13). \]


Alternating Map 


Let $p \in S_n$. Consider the following rational number:

\[ \frac{p(1)-p(2)}{1-2} \cdot \frac{p(1)-p(3)}{1-3} \cdots \frac{p(1)-p(n)}{1-n} \cdot \frac{p(2)-p(3)}{2-3} \cdots \frac{p(2)-p(n)}{2-n} \cdots \frac{p(n-1)-p(n)}{(n-1)-n}. \]

The above expression in short is denoted by

\[ \prod_{1 \leq i < j \leq n} \frac{p(i)-p(j)}{i-j}. \]


Proposition 5.2.8 $\prod_{1 \leq i < j \leq n} \frac{p(i)-p(j)}{i-j} = \pm 1$ for all $p \in S_n$.


Proof Since $p$ is a permutation, for every pair $(k, l)$, $k \neq l$, there is a unique pair
$(i, j)$, $i \neq j$ with $p(i) = k$ and $p(j) = l$. If $i < j$, then $k - l$ appears once and
only once in the numerator of the expression, and if $j < i$, then $l - k$ appears once
and only once in the numerator of the expression. Also $k - l$ or $l - k$ appears once
and only once in the denominator according as $k < l$ or $l < k$. This proves the
result. ♯


Definition 5.2.9 The map $\chi$ from $S_n$ to $\{1, -1\}$ defined by

\[ \chi(p) = \prod_{1 \leq i < j \leq n} \frac{p(i)-p(j)}{i-j} \]

is called the alternating map of degree $n$.


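Theorem 5.2.10 The alternating map $\chi : S_n \to \{1, -1\}$ is a surjective map which
takes any transposition to $-1$. Further, it is a homomorphism in the sense that

\[ \chi(pq) = \chi(p)\chi(q) \quad \text{for all } p, q \in S_n. \]


Proof We first show that $\chi$ is a homomorphism.

\[ \chi(pq) = \prod_{1 \leq i < j \leq n} \frac{pq(i)-pq(j)}{i-j} = \prod_{1 \leq i < j \leq n} \frac{p(q(i))-p(q(j))}{q(i)-q(j)} \cdot \prod_{1 \leq i < j \leq n} \frac{q(i)-q(j)}{i-j} = \prod_{1 \leq i < j \leq n} \frac{p(q(i))-p(q(j))}{q(i)-q(j)} \cdot \chi(q). \]

Since $q$ is a permutation of $1, 2, \ldots, n$, it follows that

\[ \prod_{1 \leq i < j \leq n} \frac{p(q(i))-p(q(j))}{q(i)-q(j)} = \chi(p). \]

This shows that $\chi$ is a homomorphism. Hence $\chi(p)\chi(p^{-1}) = \chi(I) = 1$ for every
permutation $p$. Consider the transposition $\tau = (1, 2)$. Clearly,

\[ \chi(\tau) = \frac{2-1}{1-2} \cdot \frac{2-3}{1-3} \cdots \frac{2-n}{1-n} \cdot \frac{1-3}{2-3} \cdots \frac{1-n}{2-n} \cdot \prod_{3 \leq i < j \leq n} \frac{i-j}{i-j} = -1. \]

Consider a general transposition $\sigma = (k, l)$. Take a permutation $p \in S_n$ for which
$p(1) = k$, $p(2) = l$. Observe that such a permutation exists. Then $p\tau p^{-1} = \sigma$.
Hence $\chi(\sigma) = \chi(p)\chi(\tau)\chi(p^{-1}) = -1$. Thus, $\chi$ takes any transposition to $-1$. Since
$\chi(I) = 1$, the map $\chi$ is also surjective. ♯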
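The statements of the theorem can be checked exhaustively for small $n$. The following Python sketch evaluates $\chi$ directly from the product formula with exact rational arithmetic and verifies the homomorphism property over all of $S_4$. The names `chi` and `compose` are illustrative, and permutations are assumed to be stored as tuples of images of $1, \ldots, n$.

```python
from fractions import Fraction
from itertools import combinations, permutations


def chi(p):
    """Alternating map: product of (p(i) - p(j)) / (i - j) over all i < j."""
    value = Fraction(1)
    for i, j in combinations(range(1, len(p) + 1), 2):
        value *= Fraction(p[i - 1] - p[j - 1], i - j)
    return int(value)


def compose(g, f):
    """Return gf, the map i -> g(f(i))."""
    return tuple(g[f[i] - 1] for i in range(len(f)))


if __name__ == "__main__":
    s4 = list(permutations(range(1, 5)))
    ok = all(chi(compose(p, q)) == chi(p) * chi(q) for p in s4 for q in s4)
    print("chi is multiplicative on S_4:", ok)
    print("chi of the transposition (1 2):", chi((2, 1, 3, 4)))
```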


Corollary 5.2.11 Let $p \in S_n$. Suppose that

\[ p = \sigma_1\sigma_2 \cdots \sigma_r = \tau_1\tau_2 \cdots \tau_s, \]

where $\sigma_i$ and $\tau_j$ are transpositions. Then $r \equiv s \pmod 2$, i.e., 2 divides $r - s$ (equivalently,
$r$ and $s$ are both even, or both odd).


Proof From the above theorem, it follows that

\[ \chi(p) = \chi(\sigma_1)\chi(\sigma_2) \cdots \chi(\sigma_r) = \chi(\tau_1)\chi(\tau_2) \cdots \chi(\tau_s). \]

Since $\chi$ takes a transposition to $-1$, $(-1)^r = (-1)^s$. Hence $r - s$ is even. ♯


Remark 5.2.12 From the above corollary, it follows that if we can write a permutation 
as a product of even number of transpositions, then we cannot write it as a product 
of odd number of transpositions, and if we can write it as product of odd number of 
transpositions, then we cannot write it as a product of even number of transpositions. 


Definition 5.2.13 A permutation $p$ is called an even permutation if it can be
expressed as a product of an even number of transpositions, or equivalently, $\chi(p) = 1$. It
is said to be an odd permutation if it can be expressed as a product of an odd number
of transpositions. We also say that $sign(p) = 1$ if $p$ is an even permutation, and
$sign(p) = -1$ if $p$ is an odd permutation. Thus, $\chi(p)$ is also written as $sign(p)$. The
set $A_n$ of all even permutations is a subgroup of $S_n$, called the alternating group.


Example 5.2.14 Consider the permutation $p$ given by

\[ p = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 4 & 5 & 1 & 3 \end{pmatrix}. \]

Then $p = (124)(35) = (14)(12)(35)$ is a product of 3 transpositions. Hence $p$ is an
odd permutation and so $sign(p) = \chi(p) = -1$.


Proposition 5.2.15 Let $\tau$ be a transposition in $S_n$. Then $A_n\tau = \{p\tau \mid p \in A_n\}$ and
$A_n$ are disjoint and $S_n = A_n \cup A_n\tau$.


Proof Follows from the fact that $A_n\tau$ is the set of all odd permutations, whereas $A_n$
is the set of all even permutations. ♯


5.3 Alternating Forms, Determinant of an Endomorphism 


Let $V_1, V_2, \ldots, V_r$ and $W$ be vector spaces over a field $F$. A map $f$ from $V_1 \times V_2 \times
\cdots \times V_r$ to $W$ is called a multilinear map if

\[ f(x_1, \ldots, x_{i-1}, ax_i + bx_i', x_{i+1}, \ldots, x_r) = af(x_1, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_r) + bf(x_1, \ldots, x_{i-1}, x_i', x_{i+1}, \ldots, x_r) \]

for all $a, b \in F$, $x_j \in V_j$, and $x_i' \in V_i$. Thus, a multilinear map is a map which is linear
in each coordinate. If $V_i = V$ for each $i$, then it is called an $r$-linear map on $V$. If in
addition $W = F$, then it is said to be an $r$-linear form on $V$. A 2-linear form on $V$
is also called a bilinear form on $V$.

The vector product on $\mathbb{R}^3$ is a bilinear map from $\mathbb{R}^3 \times \mathbb{R}^3$ to $\mathbb{R}^3$, and the scalar
product on $\mathbb{R}^3$, or in general, an inner product on a real vector space $V$ is a bilinear
form on $V$.


Definition 5.3.1 An $r$-linear map $f$ on a vector space $V$ is called an $r$-alternating
map if $f(x_1, x_2, \ldots, x_r) = 0$ whenever $x_i = x_j$ for some $i \neq j$.


The vector product on $\mathbb{R}^3$ is a 2-alternating map. The map $f$ from $\mathbb{R}^2 \times \mathbb{R}^2$ to
$\mathbb{R}$ given by $f((a_1, a_2), (b_1, b_2)) = a_1b_2 - a_2b_1$ is a 2-alternating form on $\mathbb{R}^2$. The
map $(\bar{a}, \bar{b}, \bar{c}) \mapsto (\bar{a} \times \bar{b}) \cdot \bar{c}$ defines a 3-alternating form (called the volume form) on
$\mathbb{R}^3$.

The sum of two $r$-alternating maps from $V$ to $W$ is an $r$-alternating map (verify),
and a scalar multiple of an $r$-alternating map is also an $r$-alternating map. Thus, the
set $A_r(V, W)$ of all $r$-alternating maps forms a vector space with respect to the above
operations.

Next, let $T$ be a linear transformation from $V$ to $W$. The map $T^r$ from $V^r$ to $W^r$
defined by $T^r(x_1, x_2, \ldots, x_r) = (T(x_1), T(x_2), \ldots, T(x_r))$ is a linear transformation.
If $f$ is an $r$-alternating map from $W$ to $U$, and $T$ a linear transformation from
$V$ to $W$, then $f \circ T^r$ defines an $r$-alternating map from $V$ to $U$ (verify). This defines a
linear transformation $A_r(T)$ from $A_r(W, U)$ to $A_r(V, U)$. The following properties
can be verified easily.

1. $A_r(I_V) = I_{A_r(V, U)}$.

2. $A_r(T_2 \circ T_1) = A_r(T_1) \circ A_r(T_2)$, where $T_1$ is a linear transformation from $W_1$ to
$W_2$ and $T_2$ is a linear transformation from $W_2$ to $W_3$.

In particular, $A_r$ defines a map from $End(V)$ to $End(A_r(V))$.


Proposition 5.3.2 Let $f$ be an $r$-alternating form on $V$, and $\{x_1, x_2, \ldots, x_r\}$ a linearly
dependent set. Then $f(x_1, x_2, \ldots, x_r) = 0$.


Proof Under the hypothesis, there is an $i$ such that $x_i$ is a linear combination of the
rest of the coordinates. Substituting this linear combination at the $i$th coordinate,
expanding, and using the property of being alternating, we get the result. ♯


Corollary 5.3.3 Let $V$ be a vector space of dimension $n$. Then the vector space
$A_r(V, W) = \{0\}$ for all $r > n$. ♯


Proposition 5.3.4 Let $f$ be an $r$-alternating map on $V$. Then

\[ f(x_1, \ldots, x_{i-1}, x_j, x_{i+1}, \ldots, x_{j-1}, x_i, x_{j+1}, \ldots, x_r) = -f(x_1, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_{j-1}, x_j, x_{j+1}, \ldots, x_r). \]

In other words, if we interchange two coordinates, then the value of $f$ changes its
sign.


Proof From the definition of alternating map, it follows that

\[ f(x_1, \ldots, x_{i-1}, x_i + x_j, x_{i+1}, \ldots, x_{j-1}, x_i + x_j, x_{j+1}, \ldots, x_r) = 0. \]

Expanding, and observing that

\[ f(x_1, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_{j-1}, x_i, x_{j+1}, \ldots, x_r) = 0 = f(x_1, \ldots, x_{i-1}, x_j, x_{i+1}, \ldots, x_{j-1}, x_j, x_{j+1}, \ldots, x_r), \]

the result follows. ♯


The above result can be restated as follows. 


Proposition 5.3.5 Let $f$ be an $r$-alternating map on $V$, and $\tau$ a transposition in $S_r$.
Then

\[ f(x_{\tau(1)}, x_{\tau(2)}, \ldots, x_{\tau(r)}) = -f(x_1, x_2, \ldots, x_r). \] ♯


In general, we have the following proposition.


Proposition 5.3.6 Let $f$ be an $r$-alternating map from $V$ to $W$, and $p \in S_r$. Then

\[ f(x_{p(1)}, x_{p(2)}, \ldots, x_{p(r)}) = sign(p)f(x_1, x_2, \ldots, x_r), \]

where $sign(p) = 1$ if $p$ is an even permutation, and it is $-1$ if $p$ is an odd permutation.


Proof If $p = \tau_1\tau_2 \cdots \tau_m$, where each $\tau_i$ is a transposition, then applying the above result successively, we see that

\[ f(x_{p(1)}, x_{p(2)}, \ldots, x_{p(r)}) = (-1)^m f(x_1, x_2, \ldots, x_r). \]

The result follows. ♯
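Proposition 5.3.7 Let $f$ be an $r$-alternating map on $V$. Let $\{x_1, x_2, \ldots, x_r\}$ and
$\{y_1, y_2, \ldots, y_r\}$ be subsets of $V$. Suppose that $y_i = \sum_{j=1}^{r} a_{ji}x_j$. Then

\[ f(y_1, y_2, \ldots, y_r) = \sum_{p \in S_r} sign(p)\, a_{1p(1)}a_{2p(2)} \cdots a_{rp(r)}\, f(x_1, x_2, \ldots, x_r) = \sum_{p \in S_r} sign(p) \prod_{i=1}^{r} a_{ip(i)}\, f(x_1, x_2, \ldots, x_r). \]


Proof $f(y_1, y_2, \ldots, y_r) = f(\sum_{j=1}^{r} a_{j1}x_j, \sum_{j=1}^{r} a_{j2}x_j, \ldots, \sum_{j=1}^{r} a_{jr}x_j)$. Expanding by
multilinearity and keeping in mind that the value of $f$ is 0 whenever two arguments
are the same, we see that

\[ f(y_1, y_2, \ldots, y_r) = \sum_{p \in S_r} a_{p(1)1}a_{p(2)2} \cdots a_{p(r)r}\, f(x_{p(1)}, x_{p(2)}, \ldots, x_{p(r)}). \]

Using the above proposition, we find that

\[ f(y_1, y_2, \ldots, y_r) = \sum_{p \in S_r} sign(p) \prod_{i=1}^{r} a_{p(i)i}\, f(x_1, x_2, \ldots, x_r). \]

This agrees with the stated formula, since $\prod_{i=1}^{r} a_{p(i)i} = \prod_{i=1}^{r} a_{ip^{-1}(i)}$, $sign(p) = sign(p^{-1})$, and $p \mapsto p^{-1}$ is a bijective map on $S_r$. ♯

Corollary 5.3.8 Let $V$ be a vector space with a basis $\{x_1, x_2, \ldots, x_n\}$ and $f$ an
$n$-alternating form on $V$. Then $f$ is uniquely determined by its value $f(x_1, x_2, \ldots, x_n)$
on $(x_1, x_2, \ldots, x_n)$. ♯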


Theorem 5.3.9 Let $V$ be a vector space of dimension $n$ over a field $F$. Then the
dimension of $A_n(V, F)$ is 1.


Proof Let $\{u_1, u_2, \ldots, u_n\}$ be a basis of $V$. Then every $n$-alternating form $f$ is
uniquely determined by its value $f(u_1, u_2, \ldots, u_n)$ on $(u_1, u_2, \ldots, u_n)$. Indeed, given
$(x_1, x_2, \ldots, x_n)$ such that $x_j = \sum_{i=1}^{n} a_{ij}u_i$, we have $f(x_1, x_2, \ldots, x_n) = \sum_{p \in S_n} sign(p)
\prod_{i=1}^{n} a_{p(i)i}\, f(u_1, u_2, \ldots, u_n)$. This shows that the dimension of $A_n(V, F)$ is at most
1 (any two nonzero $n$-alternating forms differ by a scalar multiple). It is sufficient, therefore,
to show that $A_n(V, F) \neq \{0\}$. We show that the map $f$ defined by

\[ f(x_1, x_2, \ldots, x_n) = \sum_{p \in S_n} sign(p) \prod_{i=1}^{n} a_{p(i)i}, \]

where $x_j = \sum_{i=1}^{n} a_{ij}u_i$, is a nonzero $n$-alternating form on $V$. Clearly, $f(u_1, u_2, \ldots,
u_n) = 1$, for then the matrix $[a_{ij}]$ is the identity matrix, and so $a_{ii} = 1$ for all
$i$ and $a_{ij} = 0$ for $i \neq j$. It is clearly an $n$-linear map. We show that it is alternating.
Suppose that $x_j = x_k$, $j \neq k$. Then $a_{ij} = a_{ik}$ for all $i$. Let $p$ be a permutation
and $\tau = (j\ k)$. Then $a_{p(\tau(j))j} = a_{p(k)j} = a_{p(k)k}$ and $a_{p(\tau(k))k} = a_{p(j)k} =
a_{p(j)j}$, and for $l \notin \{j, k\}$, $a_{p(\tau(l))l} = a_{p(l)l}$. It follows that $a_{p(1)1}a_{p(2)2} \cdots a_{p(n)n} =
a_{p\tau(1)1}a_{p\tau(2)2} \cdots a_{p\tau(n)n}$, and so $sign(p)a_{p(1)1}a_{p(2)2} \cdots a_{p(n)n} = -sign(p\tau)a_{p\tau(1)1}
a_{p\tau(2)2} \cdots a_{p\tau(n)n}$. Now $S_n$ is the disjoint union of $A_n$ and $A_n\tau = \{p\tau \mid p \in A_n\}$. Hence,

\[ \sum_{p \in S_n} sign(p) \prod_{i=1}^{n} a_{p(i)i} = \sum_{p \in A_n} sign(p) \prod_{i=1}^{n} a_{p(i)i} + \sum_{p \in A_n} sign(p\tau) \prod_{i=1}^{n} a_{p\tau(i)i} = 0. \]

This shows that $f(x_1, x_2, \ldots, x_n) = 0$ whenever $x_j = x_k$ for some $j \neq k$. Thus, $f$ is
alternating. ♯


Let $T$ be an endomorphism of $V$, where $V$ is a vector space of dimension $n$
over a field $F$. Then $T$ induces a linear transformation $A_n(T)$ from $A_n(V, F)$ to
itself. Since $A_n(V, F)$ is of dimension 1, it follows that $A_n(T)$ is multiplication by a
scalar. This scalar is denoted by $\det(T)$, and it is called the determinant of $T$. Thus,
$A_n(T)(f) = \det(T) \cdot f$. This defines a map det from $End(V)$ to $F$, and it is called
the determinant map on $End(V)$. Since $A_n(T_1 \circ T_2) = A_n(T_2) \circ A_n(T_1)$, we have the
following corollary.


Corollary 5.3.10 $\det(T_1 \circ T_2) = \det(T_2) \cdot \det(T_1) = \det(T_1) \cdot \det(T_2)$. ♯




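Corollary 5.3.11 Let $T$ be a linear transformation from a vector space $V$ to itself.
Let $\{u_1, u_2, \ldots, u_n\}$ be a basis of $V$. Suppose that $T(u_j) = \sum_{i=1}^{n} a_{ij}u_i$. Then $\det(T) =
\sum_{p \in S_n} sign(p) \prod_{i=1}^{n} a_{p(i)i}$.


Proof Let $f$ be a nonzero $n$-alternating form on $V$. Then by the definition, $A_n(T)(f) = f \circ T^n$.
Now,

\[ (f \circ T^n)(u_1, u_2, \ldots, u_n) = f(T(u_1), T(u_2), \ldots, T(u_n)) \]
\[ = f\Big(\sum_{i=1}^{n} a_{i1}u_i, \sum_{i=1}^{n} a_{i2}u_i, \ldots, \sum_{i=1}^{n} a_{in}u_i\Big) \]
\[ = \sum_{p \in S_n} sign(p) \prod_{i=1}^{n} a_{p(i)i}\, f(u_1, u_2, \ldots, u_n) \]
\[ = \det(T)f(u_1, u_2, \ldots, u_n). \]

Since $f(u_1, u_2, \ldots, u_n) \neq 0$, this shows that

\[ \det(T) = \sum_{p \in S_n} sign(p) \prod_{i=1}^{n} a_{p(i)i}. \] ♯


The following corollary is immediate from the above result.


Corollary 5.3.12 $\det(I_V) = 1$. ♯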


Corollary 5.3.13 Let $T$ be a linear transformation from $V$ to $V$. Then $T$ is invertible
if and only if $\det(T) \neq 0$.


Proof Suppose that $T$ is invertible. Then $T \circ T^{-1} = I_V$. Hence $1 = \det(T \circ T^{-1}) =
\det(T) \cdot \det(T^{-1})$. This shows that $\det(T) \neq 0$. Conversely, suppose that $\det(T) \neq 0$.
Let $\{u_1, u_2, \ldots, u_n\}$ be a basis of $V$. It is sufficient to show that $\{T(u_1), T(u_2), \ldots,
T(u_n)\}$ is linearly independent. Suppose not. Then for any $f \in A_n(V, F)$, $A_n(T)(f)
(u_1, u_2, \ldots, u_n) = f(T(u_1), T(u_2), \ldots, T(u_n)) = 0 = \det(T)f(u_1, u_2, \ldots, u_n)$.
Since there is an $f \in A_n(V, F)$ such that $f(u_1, u_2, \ldots, u_n) \neq 0$, it follows that $\det(T) =
0$. This is a contradiction. ♯


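Corollary 5.3.14 $\det(T) = \det(T^t)$.


Proof Let $T$ be a linear transformation on $V$ and $\{u_1, u_2, \ldots, u_n\}$ a basis of $V$.
Consider the dual basis $\{u_1^*, u_2^*, \ldots, u_n^*\}$. Suppose that $T(u_j) = \sum_{i=1}^{n} a_{ij}u_i$. Then
$T^t(u_j^*) = \sum_{i=1}^{n} b_{ij}u_i^*$, where $b_{ij} = a_{ji}$. Thus, (see the above corollary)

\[ \det(T^t) = \sum_{p \in S_n} sign(p) \prod_{i=1}^{n} b_{p(i)i} = \sum_{p \in S_n} sign(p) \prod_{i=1}^{n} a_{ip(i)}. \]

Since $\prod_{i=1}^{n} a_{p(i)i} = \prod_{i=1}^{n} a_{ip^{-1}(i)}$, $sign(p) = sign(p^{-1})$, and $p \mapsto p^{-1}$ is a bijective
map on $S_n$, we have

\[ \sum_{p \in S_n} sign(p) \prod_{i=1}^{n} a_{p(i)i} = \sum_{p \in S_n} sign(p) \prod_{i=1}^{n} a_{ip(i)}. \]

The result follows. ♯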




Corollary 5.3.15 Any map $f$ from the set $M_n(F)$ of $n \times n$ matrices with entries
in $F$ to $F$ which is linear on each row and which is 0 on matrices with two rows
the same is uniquely determined by its value $f(I_n)$ on the identity matrix $I_n$. In fact,

\[ f(A) = \sum_{p \in S_n} sign(p) \prod_{i=1}^{n} a_{p(i)i}\, f(I_n). \]


Proof Take $V = F^n$ and realize an $n \times n$ matrix $A$ as an element $(\bar{r}_1, \bar{r}_2, \ldots, \bar{r}_n)$ of
$V^n$, where $\bar{r}_i = [a_{i1}, a_{i2}, \ldots, a_{in}]$ is the $i$th row of $A$. Then $\bar{r}_i = a_{i1}\bar{e}_1 + a_{i2}\bar{e}_2 +
\cdots + a_{in}\bar{e}_n$. Since $f$ is $n$-alternating, it follows from the above theorem that

\[ f(A) = f(\bar{r}_1, \bar{r}_2, \ldots, \bar{r}_n) = \sum_{p \in S_n} sign(p) \prod_{i=1}^{n} a_{p(i)i}\, f(\bar{e}_1, \bar{e}_2, \ldots, \bar{e}_n) = \sum_{p \in S_n} sign(p) \prod_{i=1}^{n} a_{p(i)i}\, f(I_n). \] ♯


In particular, we have the following corollary.


Corollary 5.3.16 There is a unique map from the set $M_n(F)$ of $n \times n$ matrices
with entries in $F$ to $F$, given by $A \mapsto \sum_{p \in S_n} sign(p) \prod_{i=1}^{n} a_{p(i)i}$, which is linear on each
row, which is 0 on matrices with two rows the same, and which is 1 on $I_n$. ♯


The following corollary follows from Theorem 5.1.3 and the above corollary.
Corollary 5.3.17 $\det(A) = \sum_{p \in S_n} sign(p) \prod_{i=1}^{n} a_{p(i)i}$.
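The permutation-sum formula of Corollary 5.3.17 can be evaluated literally for small matrices. The Python sketch below does so; it computes the sign of a permutation by counting inversions (pairs $i < j$ with $p(i) > p(j)$), which gives the same value as the alternating map since each inversion contributes one negative factor to the product defining it. The function names are illustrative, and indices are 0-based in the code.

```python
from itertools import permutations


def sign(p):
    """Sign of a permutation of 0..n-1, via the parity of its inversion count."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1


def det_by_permutations(a):
    """Sum over p in S_n of sign(p) * prod_i a[p(i)][i]."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        term = sign(p)
        for i in range(n):
            term *= a[p[i]][i]   # the factor a_{p(i) i}
        total += term
    return total


if __name__ == "__main__":
    A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
    At = [list(col) for col in zip(*A)]
    print(det_by_permutations(A), det_by_permutations(At))
```

The sum has $n!$ terms, so this is a check of the formula rather than a practical method.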


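Let $A = [a_{ij}]$ be an $n \times n$ matrix with entries in a field $F$. Then $A$ defines a linear
transformation $L_A$ from the vector space $F^n$ to itself by

\[ L_A\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = A\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}. \]

Let $\{\bar{e}_1, \bar{e}_2, \ldots, \bar{e}_n\}$ be the standard basis of $F^n$ (we write the elements of $F^n$ as
columns). Then

\[ L_A(\bar{e}_j) = \sum_{i=1}^{n} a_{ij}\bar{e}_i. \]

In other words, $M^{\bar{e}_1, \bar{e}_2, \ldots, \bar{e}_n}_{\bar{e}_1, \bar{e}_2, \ldots, \bar{e}_n}(L_A) = A$. It follows that

\[ \det(L_A) = \sum_{p \in S_n} sign(p) \prod_{i=1}^{n} a_{p(i)i} = \det(A). \]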


To summarize, we list some of the important properties of determinant of matri- 
ces/linear transformations. The first 3 properties are the defining properties, and 
the rest of them are the consequences which are useful in computations and other 
discussions. 

1. $\det(I_V) = 1 = \det(I_n)$.

2. Determinant is a multilinear map on rows/columns of matrices. 

3. Determinant of a matrix is zero whenever two distinct rows/columns are same. 


4. $\det(A) = \det(A^t) = \sum_{p \in S_n} sign(p) \prod_{i=1}^{n} a_{ip(i)}$.

The following property of determinant, which is a consequence of 2 and 3 (together with 4 for columns), is useful
in computing the determinant of a matrix. See the example below.

5. Determinant of a matrix does not change if we add a multiple of a row (column)
to another row (column).

6. $\det(A \cdot B) = \det(A) \cdot \det(B)$. This follows from the fact that $L_{A \cdot B} = L_A \circ L_B$,
and $\det(L_A \circ L_B) = \det(L_A) \cdot \det(L_B) = \det(A) \cdot \det(B)$.

7. $A$ is invertible if and only if $\det(A) \neq 0$. This follows from the facts: (i) $A$
is invertible if and only if $L_A$ is invertible, and (ii) $L_A$ is invertible if and only if
$\det(A) = \det(L_A) \neq 0$.

8. Determinant of an upper (lower) triangular matrix is the product of its diagonal
entries: Let $A = [a_{ij}]$ be an upper triangular matrix. Then $a_{ij} = 0$ for $i > j$. Let
$p \in S_n$. If $p$ is a nonidentity permutation, then $p(i) > i$ for at least one $i$. Hence, the
term $sign(p) \prod_{i=1}^{n} a_{p(i)i} = 0$ for every nonidentity permutation $p$. This shows that
$\det(A) = \prod_{i=1}^{n} a_{ii}$. In particular, determinant of a diagonal matrix is the product of the
diagonal entries. $\det(aI_n) = a^n$. det(E}) =);

9. Determinant of the permutation matrix $A_p$ determined by the permutation $p$ is
$sign(p)$ (verify).


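Example 5.3.18 Consider the $n \times n$ matrix $A_n = [a_{ij}]$, where $a_{ij} = \min(i, j)$. For
example,

\[ A_4 = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 2 & 2 \\ 1 & 2 & 3 & 3 \\ 1 & 2 & 3 & 4 \end{bmatrix}. \]

Subtracting the first row of $A_n$ from each of the following rows of $A_n$, it reduces to

\[ \begin{bmatrix} 1 & 1_{1 \times (n-1)} \\ 0_{(n-1) \times 1} & A_{n-1} \end{bmatrix}, \]

where $1_{1 \times (n-1)}$ is a $1 \times (n-1)$ matrix with each entry 1 and $0_{(n-1) \times 1}$ represents the
$(n-1) \times 1$ matrix with each entry 0. Thus, $\det(A_n) = \det(A_{n-1})$. Using induction,
and the fact that $\det(A_1) = 1$, we see that $\det(A_n) = 1$ for all $n$.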


Example 5.3.19 Vandermonde matrix and determinant. A matrix $V_n$ of the type

\[ V_n = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \\ x_1^2 & x_2^2 & \cdots & x_n^2 \\ \vdots & \vdots & & \vdots \\ x_1^{n-1} & x_2^{n-1} & \cdots & x_n^{n-1} \end{bmatrix} \]

is called the Vandermonde matrix, and $\det(V_n)$ of a Vandermonde matrix is called a
Vandermonde determinant. We show, by induction, that $\det(V_n) = \prod_{n \geq i > j \geq 1}(x_i - x_j)$.
If $x_i = x_j$ for some $i \neq j$, then two columns of the Vandermonde matrix are the same, and
so $\det(V_n) = 0$, and then there is nothing to do. Assume that all $x_1, x_2, \ldots, x_n$ are
distinct. For $n = 1$, there is nothing to do. For $n = 2$, $\det(V_2) = x_2 - x_1$. Thus, the
result is true for $n = 1$ and $n = 2$. Assume that the result is true for all $m < n$, where $n \geq 3$.
Let $f(t) = \det(V_n(t))$, where the matrix $V_n(t)$ is obtained by replacing $x_n$ by $t$ in
the Vandermonde matrix $V_n$. Clearly, $f(t)$ is a polynomial in $t$ of degree $n - 1$. Each
$x_i$, $i \leq n - 1$, is a root of $f(t)$, for if we put $t = x_i$, $i \leq n - 1$, then $\det(V_n(t)) =
0$. Thus $f(t) = a(t - x_1)(t - x_2) \cdots (t - x_{n-1})$ for some constant $a$ which is the
coefficient of $t^{n-1}$ in $f(t)$. Clearly, the coefficient of $t^{n-1}$ is $\det(V_{n-1})$. By the induction
hypothesis $a = \det(V_{n-1}) = \prod_{n-1 \geq i > j \geq 1}(x_i - x_j)$. Substituting the value of $a$ and putting $t = x_n$, we find
that $\det(V_n) = \prod_{n \geq i > j \geq 1}(x_i - x_j)$.
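The product formula can be spot-checked numerically. The Python sketch below builds the Vandermonde matrix with rows $1, x, x^2, \ldots, x^{n-1}$ and compares a co-factor expansion of its determinant against the product of differences. The sample nodes and all function names are illustrative choices, not taken from the text.

```python
def minor(a, i, j):
    """Submatrix obtained by deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]


def det(a):
    """First-column co-factor expansion; the empty matrix has determinant 1."""
    if not a:
        return 1
    return sum((-1) ** i * a[i][0] * det(minor(a, i, 0)) for i in range(len(a)))


def vandermonde(xs):
    """Row k holds the k-th powers x_1^k, ..., x_n^k for k = 0, ..., n-1."""
    return [[x ** k for x in xs] for k in range(len(xs))]


def product_of_differences(xs):
    """Product of (x_i - x_j) over all pairs with i > j."""
    result = 1
    for i in range(len(xs)):
        for j in range(i):
            result *= xs[i] - xs[j]
    return result


if __name__ == "__main__":
    xs = [2, 3, 5, 7]   # arbitrary sample nodes
    print(det(vandermonde(xs)), product_of_differences(xs))
```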


Determinant as Volume Form 


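Consider a parallelogram in $\mathbb{R}^2$ with co-terminus edges $OP$ and $OQ$ with $P$ and
$Q$ having position vectors $\bar{r}_1 = [a_{11}, a_{12}]$ and $\bar{r}_2 = [a_{21}, a_{22}]$ respectively. The
area $\Omega$ of the parallelogram is given by $\Omega = \text{base} \times \text{height} = |\bar{r}_1||\bar{r}_2^{\perp}|$, where
$\bar{r}_2^{\perp} = \bar{r}_2 - \langle \bar{r}_2, \frac{\bar{r}_1}{|\bar{r}_1|} \rangle \frac{\bar{r}_1}{|\bar{r}_1|}$ is the resolution of $\bar{r}_2$ orthogonal to $\bar{r}_1$. Now,

\[ |\bar{r}_2^{\perp}|^2 = \Big\langle \bar{r}_2 - \Big\langle \bar{r}_2, \frac{\bar{r}_1}{|\bar{r}_1|} \Big\rangle \frac{\bar{r}_1}{|\bar{r}_1|},\ \bar{r}_2 - \Big\langle \bar{r}_2, \frac{\bar{r}_1}{|\bar{r}_1|} \Big\rangle \frac{\bar{r}_1}{|\bar{r}_1|} \Big\rangle = |\bar{r}_2|^2 - \frac{\langle \bar{r}_1, \bar{r}_2 \rangle^2}{|\bar{r}_1|^2}. \]

Thus,

\[ \Omega^2 = |\bar{r}_1|^2|\bar{r}_2|^2 - \langle \bar{r}_1, \bar{r}_2 \rangle^2 = \det\begin{bmatrix} \bar{r}_1\bar{r}_1^t & \bar{r}_1\bar{r}_2^t \\ \bar{r}_2\bar{r}_1^t & \bar{r}_2\bar{r}_2^t \end{bmatrix} = \det(AA^t), \]

where

\[ A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}. \]

Similarly, if we take a parallelogram in $\mathbb{R}^3$ with co-terminus edges $OP$ and $OQ$
with $P$ and $Q$ having position vectors $\bar{r}_1 = [a_{11}, a_{12}, a_{13}]$ and $\bar{r}_2 = [a_{21}, a_{22}, a_{23}]$
respectively, then the area $\Omega$ of the parallelogram is $\text{base} \times \text{height} = |\bar{r}_1||\bar{r}_2^{\perp}|$,
where $\bar{r}_2^{\perp} = \bar{r}_2 - \langle \bar{r}_2, \frac{\bar{r}_1}{|\bar{r}_1|} \rangle \frac{\bar{r}_1}{|\bar{r}_1|}$ is the resolution of $\bar{r}_2$ orthogonal to $\bar{r}_1$. It turns out
again that the area $\Omega = \sqrt{\det(AA^t)}$, where

\[ A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}. \]

More generally, it follows by induction that the volume $V$ of the parallelepiped
in $\mathbb{R}^n$ whose co-terminus edges are given by the vectors $\{\bar{r}_1, \bar{r}_2, \ldots, \bar{r}_m\}$ is given by
$V = \sqrt{\det(AA^t)}$, where

\[ A = \begin{bmatrix} \bar{r}_1 \\ \bar{r}_2 \\ \vdots \\ \bar{r}_m \end{bmatrix}. \]

In particular, the volume $V$ of the parallelepiped in $\mathbb{R}^n$ whose co-terminus edges are
given by vectors $\{\bar{r}_1, \bar{r}_2, \ldots, \bar{r}_n\}$ is given by $V = |\det(A)|$, where

\[ A = \begin{bmatrix} \bar{r}_1 \\ \bar{r}_2 \\ \vdots \\ \bar{r}_n \end{bmatrix}. \]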


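Example 5.3.20 The area $\Omega$ of the parallelogram in $\mathbb{R}^3$ with co-terminus edges given
by vectors $[1, 0, 1]$ and $[2, 1, 1]$ is given by

\[ \Omega = \sqrt{\det\left(\begin{bmatrix} 1 & 0 & 1 \\ 2 & 1 & 1 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 0 & 1 \\ 1 & 1 \end{bmatrix}\right)} = \sqrt{\det\begin{bmatrix} 2 & 3 \\ 3 & 6 \end{bmatrix}} = \sqrt{3}. \]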
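The volume formula can be evaluated directly from the matrix $AA^t$ of pairwise inner products of the edge vectors. The following Python sketch does this for the two vectors of the example; for them $AA^t$ has rows $[2, 3]$ and $[3, 6]$, so its determinant is 3 and the area is $\sqrt{3}$, which agrees with the length of the cross product $[-1, 1, 1]$. The function names are illustrative.

```python
import math


def gram(rows):
    """The matrix A A^t: entry (i, j) is the inner product of rows i and j."""
    return [[sum(x * y for x, y in zip(r, s)) for s in rows] for r in rows]


def minor(a, i, j):
    """Submatrix obtained by deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]


def det(a):
    """First-column co-factor expansion; the empty matrix has determinant 1."""
    if not a:
        return 1
    return sum((-1) ** i * a[i][0] * det(minor(a, i, 0)) for i in range(len(a)))


def volume(rows):
    """Volume of the parallelepiped with the given co-terminus edge vectors."""
    return math.sqrt(det(gram(rows)))


if __name__ == "__main__":
    edges = [[1, 0, 1], [2, 1, 1]]
    print(gram(edges), volume(edges))
```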


Exercises 


5.3.1 Let $V$ be a vector space of dimension $n$ and $W$ a vector space of dimension
$m$. Find a basis and also the dimension of $A_r(V, W)$. In particular, find a basis and
show that the dimension of $A_r(V, F)$ is ${}^nC_r$.


5.3.2 Find the determinant of the matrix

$$\begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 4 \\ 1 & 4 & 9 & 16 \\ 1 & 8 & 27 & 64 \end{pmatrix}.$$


Find the co-factor, adjoint, and also the inverse of the matrix. 


5.3.3 Find the determinant of the matrix $A_n$ given by

$$A_n = \begin{pmatrix} x_1 & x_2 & \cdots & x_n \\ x_1^2 & x_2^2 & \cdots & x_n^2 \\ \vdots & \vdots & & \vdots \\ x_1^n & x_2^n & \cdots & x_n^n \end{pmatrix}.$$


5.3.4 Show that the determinant of an orthogonal matrix is $\pm 1$. Let A be a 2 × 2
orthogonal matrix whose determinant is 1. Show that it is a rotation matrix in the
sense that there exists $\theta$ such that

$$A = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}.$$

Show further that if det(A) = −1, it represents a reflection in the plane about a line
passing through the origin.


5.3.5 Find the determinant of the n × n matrix $A = [a_{ij}]$, where $a_{ij} = \max(i, j)$.


5.3.6 Show that the determinant of the n × n matrix $A_n$ whose ith row is
$[(i-1)n + 1, (i-1)n + 2, \ldots, in]$ is 0 for $n \geq 3$. What is the rank of $A_n$?


5.3.7 Show that the determinant of a unitary matrix is a complex number whose
modulus is 1. Conversely, show that any complex number with modulus 1 is the
determinant of a unitary matrix.


5.3.8 Let R be a commutative integral domain. Then it can be considered as a subring
of a field F. Let A be a matrix with entries in R and so in F. Show that A has an inverse
with entries in R if and only if det(A) is a unit in R in the sense that its inverse is in
R. Deduce that a matrix A with entries in $\mathbb{Z}$ has an inverse with entries in $\mathbb{Z}$ if and only
if det(A) = ±1.


5.3.9 Let p be a permutation of degree n, and let $A_p$ be the matrix obtained by permuting
the rows of the identity matrix through the permutation p. Show that $A_p$ is an orthogonal
matrix whose determinant is sign(p).


5.3.10 Suppose that A is invertible. Show that $\operatorname{adj}(A)$ is also invertible. Is the converse
true? Support.


5.3.11 Suppose that A is a 3 × 3 invertible matrix with determinant 3. Find the
determinant of $\operatorname{adj}(A)$.


5.3.12 Suppose that A is an invertible 4 × 4 matrix such that $\det(\operatorname{adj}(A)) = 8$. Find
the determinant of A.


5.3.13 Can we find an invertible 4 × 4 rational matrix A such that $\det(\operatorname{adj}(A)) = 2$?
Support.




5.3.14 Let A be a real skew-symmetric n × n matrix, where n is odd. Show that
det(A) = 0. Deduce that A is not invertible.


5.3.15 Find the solution of the following system of linear equations using
Cramer's rule:

$$\begin{aligned} x + y + z + t &= 3 \\ 2x + 3y + 4z + t &= 5 \\ 4x + 9y + 16z + t &= 2 \\ 8x + 27y + 64z + t &= 1. \end{aligned}$$

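5.3.16 Let $\{\bar{r}_1, \bar{r}_2, \ldots, \bar{r}_{n-1}\}$ be an ordered set of n − 1 vectors in $\mathbb{R}^n$. Define a map
f from $\mathbb{R}^n$ to $\mathbb{R}$ by

$$f(\bar{x}) = \operatorname{Det}\begin{pmatrix} \bar{x} \\ \bar{r}_1 \\ \bar{r}_2 \\ \vdots \\ \bar{r}_{n-1} \end{pmatrix}.$$

Show that f is a linear functional on $\mathbb{R}^n$. Deduce that there is a unique vector $\bar{u}$ such that
$f(\bar{x}) = \langle \bar{x}, \bar{u} \rangle$. Let us call this vector $\bar{u}$ the vector product of $\{\bar{r}_1, \bar{r}_2, \ldots, \bar{r}_{n-1}\}$.
Observe that on $\mathbb{R}^3$, the concept agrees with that of the usual vector product on $\mathbb{R}^3$. Show
that the vector product on $\mathbb{R}^n$ defined above is an (n − 1)-alternating form on $\mathbb{R}^n$.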


5.3.17 Find the vector product in $\mathbb{R}^4$ of the set of three ordered vectors {(1, 1, 1, 1),
(1, 2, 3, 4), (1, 4, 9, 16)}. Determine also the volume of the parallelepiped formed
by the three given vectors as co-terminus edges.


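5.3.18 Check if

$$\left| \det\begin{pmatrix} \bar{r}_1 \\ \bar{r}_2 \\ \vdots \\ \bar{r}_n \end{pmatrix} \right| \leq |\bar{r}_1|\,|\bar{r}_2| \cdots |\bar{r}_n|.$$

Determine the condition under which the equality holds.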




5.4 Invariant Subspaces, Eigenvalues 


Let T be a linear transformation on V. A subspace W of V is called an invariant
subspace if $T(W) \subseteq W$. Clearly, the zero space {0} and the whole space V are invariant
subspaces. These invariant subspaces are called improper invariant subspaces.
Other invariant subspaces are called proper invariant subspaces.

A linear transformation need not have any proper invariant subspaces. For example,
the rotation in $\mathbb{R}^2$ through the angle $\frac{\pi}{2}$ radian has no proper invariant subspace. We
shall be mainly interested in one-dimensional invariant subspaces. Let T be a linear
transformation on a vector space V. An element $x \in V - \{0\}$ is called an eigenvector
or a characteristic vector or a proper vector if there is a $\lambda$ in F such that $T(x) = \lambda x$.
Clearly, such a $\lambda$ is unique, and it is called the eigenvalue or characteristic value or
a proper value corresponding to the eigenvector x. If $x \neq 0$ is an eigenvector
corresponding to the eigenvalue $\lambda$, then $T(\alpha x) = \alpha T(x) = \alpha\lambda x = \lambda(\alpha x)$. This means
that the subspace $\langle x \rangle$ generated by x is a one-dimensional invariant subspace
of which any nonzero vector is an eigenvector corresponding to the same eigenvalue.
Conversely, any nonzero element of a one-dimensional invariant subspace is
an eigenvector, and all nonzero vectors of this invariant subspace correspond to the same
eigenvalue.

The eigenvalues and eigenvectors of a matrix A are defined to be the eigenvalues
and eigenvectors of the linear transformation $L_A$. Thus, a nonzero column vector $\bar{X}^t$
in $F^n$ is an eigenvector of A corresponding to the eigenvalue $\lambda$ if $L_A(\bar{X}^t) = A \cdot \bar{X}^t = \lambda \bar{X}^t$.

Theorem 5.4.1 Let V be a finite-dimensional vector space over a field F. Then $\lambda \in F$
is an eigenvalue of a linear transformation T on V if and only if $\det(\lambda I - T) = 0$.
In turn, $\lambda \in F$ is an eigenvalue of an n × n matrix A with entries in F if and only if
$\det(\lambda I_n - A) = 0$.


Proof By the definition, $\lambda$ is an eigenvalue of a linear transformation T on V if and
only if there is a nonzero element $x \in V$ such that $0 = \lambda x - T(x) = (\lambda I - T)(x)$.
This says that $\lambda I - T$ is not injective, which, since V is finite dimensional, is
equivalent to saying that $\det(\lambda I - T) = 0$. The rest of the statement follows
if we apply the result to the linear transformation $L_A$ determined by the matrix A. ♯


Let $A = [a_{ij}]$ be an n × n matrix with entries in F. Then $xI_n - A$ is a matrix with
entries in the polynomial ring F[x]. The determinant of $xI_n - A = [x\delta_{ij} - a_{ij}]$,
defined again by the formula

$$\sum_{p \in S_n} \operatorname{sign}(p) \prod_{i=1}^{n} (x\delta_{ip(i)} - a_{ip(i)}),$$

is a polynomial in F[x]. If we follow the rule of expansion of the determinant, we
see that it is a polynomial of degree n in F[x]. This polynomial is called the
characteristic polynomial of A and is denoted by $\phi_A(x)$.


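Example 5.4.2 Consider the matrix A given by

$$A = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 1 & 1 \end{pmatrix}.$$

The characteristic polynomial $\phi_A(x)$ is given by

$$\phi_A(x) = \operatorname{Det}(xI_3 - A) = \operatorname{Det}\begin{pmatrix} x-1 & 0 & -1 \\ 0 & x-1 & 0 \\ -1 & -1 & x-1 \end{pmatrix} = x^3 - 3x^2 + 2x.$$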


Definition 5.4.3 The determinant of an r × r submatrix of A all of whose diagonal
entries are also diagonal entries of A is called a principal r-minor of A.


Thus, the principal 1-minors are precisely the diagonal entries of A. There are 3
principal 2-minors of a 3 × 3 matrix, which are $\det(A_{11})$, $\det(A_{22})$ and $\det(A_{33})$.
How many principal r-minors of an n × n matrix are there? The
following result follows immediately from the expansion rule of the determinant.


Proposition 5.4.4 Let A be an n × n matrix. Then the characteristic polynomial $\phi_A(x)$
of A is given by

$$\phi_A(x) = x^n - a_1 x^{n-1} + a_2 x^{n-2} - \cdots + (-1)^r a_r x^{n-r} + \cdots + (-1)^n a_n,$$

where $a_r$ is the sum of the principal r-minors of A. ♯


In particular, it follows that $a_1$ is the sum of the diagonal entries of A. This is called
the trace of A. Similarly, $a_n$ is the determinant of A.
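The description of the coefficients as sums of principal minors translates directly into a computation. The following is a minimal sketch, assuming exact rational entries; the function names `det` and `char_poly` are illustrative and not part of the text. It enumerates every index set of size r, takes the determinant of the submatrix on those rows and columns, and attaches the sign $(-1)^r$.

```python
from fractions import Fraction
from itertools import combinations


def det(m):
    """Determinant by Gaussian elimination over exact fractions."""
    a = [[Fraction(x) for x in row] for row in m]
    n = len(a)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot in this column: singular
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result  # a row swap flips the sign
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            a[r] = [x - factor * y for x, y in zip(a[r], a[c])]
    return result


def char_poly(a):
    """Coefficients [1, -a_1, a_2, ..., (-1)^n a_n], highest power first."""
    n = len(a)
    coeffs = [Fraction(1)]
    for r in range(1, n + 1):
        # a_r: sum of the determinants of the r x r submatrices whose
        # rows and columns are picked by the same index set
        a_r = sum(
            det([[a[i][j] for j in idx] for i in idx])
            for idx in combinations(range(n), r)
        )
        coeffs.append((-1) ** r * a_r)
    return coeffs


# The matrix of Example 5.4.2.
phi = char_poly([[1, 0, 1], [0, 1, 0], [1, 1, 1]])
```

For the matrix of Example 5.4.2 the trace is 3, the principal 2-minors are 1, 0 and 1, and the determinant is 0, so `phi` holds the coefficients of $x^3 - 3x^2 + 2x$. The enumeration visits $2^n - 1$ index sets, so this is only meant for small matrices.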


Corollary 5.4.5 The eigenvalues of a matrix A with entries in a field F are precisely
the roots of the characteristic polynomial $\phi_A(x)$ which are in F.


Proof The roots of $\phi_A(x)$ in F are precisely those $\lambda \in F$ for which $\det(\lambda I - A) = 0$.
Equivalently, $\lambda I - A$ is singular. This is further equivalent to saying that there is a nonzero
vector $\bar{X}^t$ such that $A\bar{X}^t = \lambda\bar{X}^t$. ♯


Corollary 5.4.6 Let A and B be similar matrices with entries in a field F. Then

(i) $\phi_A(x) = \phi_B(x)$.

(ii) The sum of the principal r-minors of A is the same as the sum of the principal r-minors of B.

(iii) The trace of A is the same as the trace of B, and the determinant of A is also the same as that
of B.

(iv) The eigenvalues of A are the same as those of B.


Proof (ii), (iii), and (iv) are consequences of (i). Thus, it is sufficient to show (i).
Suppose that $B = PAP^{-1}$. Then

$$\phi_B(x) = \det(xI_n - B) = \det(P\,xI_n\,P^{-1} - PAP^{-1}) = \det(P)\det(xI_n - A)\det(P^{-1}) = \det(xI_n - A) = \phi_A(x).$$ ♯


Remark 5.4.7 Matrices having the same characteristic polynomial (and so the same sum of
principal r-minors, same trace, same determinant, and same eigenvalues) need not
be similar. Consider, for example, a nonidentity uni-upper triangular n × n matrix A.
Then the characteristic polynomial of A is clearly $(x - 1)^n$, which is the same as that of
the identity matrix. But the identity matrix is similar only to the identity matrix.


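Example 5.4.8 The characteristic polynomial $\phi_A(x)$ of the matrix A in Example 5.4.2
is $\phi_A(x) = x^3 - 3x^2 + 2x = x(x - 1)(x - 2)$. Thus, the eigenvalues (which are the roots of the
characteristic polynomial) are 0, 1, and 2. We also find eigenvectors. Suppose that
the column vector

$$\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$$

is an eigenvector corresponding to 0. Then

$$\begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = 0 \cdot \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.$$

Solving, we get that $x_1 = -x_3$ and $x_2 = 0$. Thus, the eigenvectors of A corresponding
to the eigenvalue 0 are the nonzero vectors of the form

$$\begin{pmatrix} \alpha \\ 0 \\ -\alpha \end{pmatrix}.$$

Similarly, the eigenvectors of A corresponding to the eigenvalue 1 are the nonzero
vectors of the form

$$\begin{pmatrix} \alpha \\ -\alpha \\ 0 \end{pmatrix},$$

and those corresponding to the eigenvalue 2 are the nonzero vectors of the form

$$\begin{pmatrix} \alpha \\ 0 \\ \alpha \end{pmatrix}.$$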
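The eigenvectors of this example can be checked directly against the defining relation $A\bar{X}^t = \lambda\bar{X}^t$. The following is a minimal sketch, taking $\alpha = 1$ in each family; the eigenvector for the eigenvalue 1 is taken to be $(1, -1, 0)^t$, which is obtained by solving $(A - I)\bar{X}^t = 0$, and the helper names are illustrative.

```python
A = [[1, 0, 1], [0, 1, 0], [1, 1, 1]]


def apply(a, v):
    """Matrix-vector product a . v for a column vector v given as a list."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def is_eigenpair(a, lam, v):
    """True when v is nonzero and a . v equals lam * v."""
    return any(v) and apply(a, v) == [lam * x for x in v]


# One representative (alpha = 1) for each eigenvalue of the example.
pairs = [(0, [1, 0, -1]), (1, [1, -1, 0]), (2, [1, 0, 1])]
checks = [is_eigenpair(A, lam, v) for lam, v in pairs]
```

Every entry of `checks` is `True`; a vector such as $(1, 0, 0)^t$ fails the test for each of the three eigenvalues.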


Remark 5.4.9 A square matrix A with entries in a field F need not have any eigenvalue
(in F). For example, the characteristic polynomial of the real matrix

$$\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$$

is $x^2 + 1$, which has no real root, and so the matrix has no real eigenvalues.


Let T be a linear transformation on a finite-dimensional vector space V. Then the
matrix representations of T with respect to different choices of bases are all similar.
As such, we can define the characteristic polynomial of T to be the characteristic
polynomial of any matrix representing T. The eigenvalues, trace, and determinant of a linear
transformation T are all read off from the characteristic polynomial of T.

A linear transformation T on a finite-dimensional vector space V is said to be
semi-simple or diagonalisable if there is a basis of V with respect to which the
matrix of T is a diagonal matrix. This is equivalent to saying that there is a basis
$\{x_1, x_2, \ldots, x_n\}$ of V such that $T(x_i) = \lambda_i x_i$ for some $\lambda_i \in F$. In other words, T is
diagonalisable if and only if there is a basis of V consisting of eigenvectors of T.
We know that matrices corresponding to different bases are similar, and that similar
matrices represent the same linear transformation with respect to different choices of
bases. It is also clear that if T is diagonalisable, then any linear transformation similar
to T is diagonalisable.

An n × n matrix A with entries in F is said to be diagonalisable or semi-simple
if $L_A$ is diagonalisable. This is equivalent to saying that A is similar to a diagonal matrix.
Thus, an n × n matrix A is diagonalisable if and only if $F^n$ has a basis consisting of
eigenvectors of A.


Theorem 5.4.10 Let T be a linear transformation on a vector space V of finite dimension.
Let $\lambda_1, \lambda_2, \ldots, \lambda_r$ be a set of distinct eigenvalues of T. Let $x_1, x_2, \ldots, x_r$ be the
corresponding eigenvectors. Then $\{x_1, x_2, \ldots, x_r\}$ is linearly independent.

Proof Suppose the contrary. Then $\{x_1, x_2, \ldots, x_r\}$ is linearly dependent. Since the
eigenvectors are nonzero, there is a minimal linearly dependent subset of $\{x_1, x_2, \ldots, x_r\}$
which, of course, contains at least two elements. After rearranging, we may suppose
that $\{x_1, x_2, \ldots, x_s\}$ is a minimal linearly dependent subset of $\{x_1, x_2, \ldots, x_r\}$. Then
there exist $\alpha_1, \alpha_2, \ldots, \alpha_s$, not all zero, such that

$$\alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_s x_s = 0. \qquad (5.1)$$

All $\alpha_i$ in Eq. 5.1 are nonzero, for otherwise it would contradict the assumption that
$\{x_1, x_2, \ldots, x_s\}$ is a minimal linearly dependent subset. Applying the linear
transformation T to Eq. 5.1, we get that

$$\alpha_1\lambda_1 x_1 + \alpha_2\lambda_2 x_2 + \cdots + \alpha_s\lambda_s x_s = 0. \qquad (5.2)$$

Multiplying Eq. 5.1 by $\lambda_1$, we get that

$$\lambda_1\alpha_1 x_1 + \lambda_1\alpha_2 x_2 + \cdots + \lambda_1\alpha_s x_s = 0. \qquad (5.3)$$

Subtracting Eq. 5.3 from Eq. 5.2, we get that

$$\alpha_2(\lambda_2 - \lambda_1)x_2 + \alpha_3(\lambda_3 - \lambda_1)x_3 + \cdots + \alpha_s(\lambda_s - \lambda_1)x_s = 0.$$

Since each $\alpha_i \neq 0$ and $\lambda_i \neq \lambda_1$ for $i \geq 2$, the proper subset $\{x_2, x_3, \ldots, x_s\}$ is linearly
dependent. This contradicts the supposition that $\{x_1, x_2, \ldots, x_s\}$ is a minimal linearly
dependent set. ♯


Corollary 5.4.11 Let T be a linear transformation on a vector space V of dimension
n. Suppose that T has n distinct eigenvalues. Then T is diagonalisable.


Proof Suppose that T has distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$, and that $\{x_1, x_2, \ldots, x_n\}$
is the corresponding set of eigenvectors of T. From the above theorem, $\{x_1, x_2, \ldots, x_n\}$ is
linearly independent. Since dim V = n, it is a basis of V. Clearly, the matrix of
T relative to this basis is the diagonal matrix $\operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$. ♯


Corollary 5.4.12 If an n × n matrix A with entries in F has n distinct eigenvalues
in F, then A is similar to a diagonal matrix. Indeed, if $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the distinct
eigenvalues of A, and $\bar{r}_1^t, \bar{r}_2^t, \ldots, \bar{r}_n^t$ are the corresponding column eigenvectors,
then the matrix $P = [\bar{r}_1^t, \bar{r}_2^t, \ldots, \bar{r}_n^t]$ is a nonsingular matrix such that
$P^{-1}AP = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$.


Proof From Theorem 5.4.10, the set $\{\bar{r}_1^t, \bar{r}_2^t, \ldots, \bar{r}_n^t\}$ of column eigenvectors forms
a basis of $F^n$ (here the elements of $F^n$ are treated as column vectors). Thus, P is
invertible. Suppose that $P^{-1}$ is the matrix whose ith row is $\bar{s}_i$. Then $\bar{s}_i\bar{r}_j^t = 1$ if
i = j and 0 otherwise. Further, since the columns $\bar{r}_j^t$ are eigenvectors of A,
$AP = [\lambda_1\bar{r}_1^t, \lambda_2\bar{r}_2^t, \ldots, \lambda_n\bar{r}_n^t]$. Hence the ith row jth column entry of $P^{-1}AP$ is $\lambda_j\bar{s}_i\bar{r}_j^t$. This
is $\lambda_i$ if i = j and 0 otherwise. This confirms that $P^{-1}AP = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$. ♯


Thus, any upper triangular matrix with all diagonal entries distinct is similar to
a diagonal matrix, because the diagonal entries of a triangular matrix are precisely the
eigenvalues of the matrix. The above result need not hold when the eigenvalues are
not all distinct. For example, a nonidentity uni-upper triangular matrix is not similar to
any diagonal matrix. This is because all eigenvalues of a unitriangular matrix are 1,
the only diagonal matrix all of whose eigenvalues are 1 is the identity matrix, and
the identity matrix is similar only to the identity matrix.

We illustrate the result by means of an example. 


Example 5.4.13 Consider the matrix

$$A = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 2 & 3 \\ 0 & 0 & 3 \end{pmatrix}.$$

The eigenvalues of A are 1, 2, and 3, which are all distinct. Hence A is similar to a
diagonal matrix. We find a nonsingular matrix P such that $P^{-1}AP = \operatorname{diag}(1, 2, 3)$.
We first find eigenvectors corresponding to these eigenvalues. Suppose that the vector

$$\bar{X}^t = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$$

is an eigenvector corresponding to the eigenvalue 1. Then $A \cdot \bar{X}^t = \bar{X}^t$. Equating
rows, we find that $x_1 + x_2 = x_1$, $2x_2 + 3x_3 = x_2$ and $3x_3 = x_3$. This implies that
$x_2 = 0 = x_3$, and $x_1$ is arbitrary. Thus

$$e_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$$

is a typical eigenvector of A corresponding to the eigenvalue 1. Using the same process
we see that

$$e_1 + e_2 = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}$$

is a typical eigenvector corresponding to the eigenvalue 2, and

$$3e_1 + 6e_2 + 2e_3 = \begin{pmatrix} 3 \\ 6 \\ 2 \end{pmatrix}$$

is an eigenvector corresponding to the eigenvalue 3. Thus, the transformation matrix

$$P = \begin{pmatrix} 1 & 1 & 3 \\ 0 & 1 & 6 \\ 0 & 0 & 2 \end{pmatrix}$$

is a nonsingular matrix such that $P^{-1}AP = \operatorname{diag}(1, 2, 3)$ (confirm it).
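The confirmation asked for above can be carried out mechanically. The following is a minimal sketch in exact rational arithmetic; the helpers `matmul` and `inverse` are illustrative names, and `inverse` assumes that its argument is nonsingular (it would stop with an error on a singular matrix).

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def inverse(m):
    """Gauss-Jordan inverse over exact fractions; m is assumed nonsingular."""
    n = len(m)
    # augment m with the identity matrix on the right
    aug = [
        [Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(m)
    ]
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        p = aug[c][c]
        aug[c] = [x / p for x in aug[c]]  # scale the pivot row to have pivot 1
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


A = [[1, 1, 0], [0, 2, 3], [0, 0, 3]]
P = [[1, 1, 3], [0, 1, 6], [0, 0, 2]]  # columns are the eigenvectors found above
D = matmul(matmul(inverse(P), A), P)
```

`D` is the diagonal matrix with diagonal entries 1, 2, 3, in the same order as the columns of `P`.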


Let T be a linear transformation on V, and $\lambda$ an eigenvalue of T. Consider
$V_\lambda = \{v \in V \mid T(v) = \lambda v\}$. Then $V_\lambda$ is a subspace of V (consisting of the eigenvectors
corresponding to $\lambda$ together with 0). This subspace is called the $\lambda$-eigenspace of T.


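Corollary 5.4.14 A linear transformation T on a finite-dimensional vector space V
is diagonalisable if and only if V is the direct sum of the eigenspaces of T.


Proof Suppose that

$$V = V_{\lambda_1} \oplus V_{\lambda_2} \oplus \cdots \oplus V_{\lambda_r},$$

where $V_{\lambda_i}$ is the $\lambda_i$-eigenspace. Clearly, the $\lambda_i$ are distinct. Let $S_i$ be a basis of $V_{\lambda_i}$. Then
$S = \bigcup_{i=1}^{r} S_i$ is a basis of V consisting of eigenvectors of T. Conversely, if V has a basis
consisting of eigenvectors of T, then V is the sum of the eigenspaces of T, and by
Theorem 5.4.10 this sum is direct. ♯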


Corollary 5.4.15 A linear transformation T on V is diagonalisable if and only
if there exists a set $\{\lambda_1, \lambda_2, \ldots, \lambda_r\}$ of distinct eigenvalues such that
$\dim V = \sum_{i=1}^{r} \dim(V_{\lambda_i})$.


Proof Since eigenvectors corresponding to distinct eigenvalues are linearly independent,
under the assumption V becomes the direct sum of its eigenspaces. ♯


Let F[x] denote the set of all polynomials with coefficients in the field F. F[x] is
a commutative integral domain (with respect to the usual addition and multiplication
of polynomials in F[x]) in the sense that it satisfies all the postulates of a field except
the existence of the multiplicative inverse of a nonzero element in F[x] (indeed, there
is no polynomial f(x) such that xf(x) = 1). Let T be a fixed linear transformation
on a vector space V over a field F, and

$$f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_r x^r$$

a polynomial in F[x]. Then f(T) is a linear transformation on V defined by

$$f(T) = a_0 I + a_1 T + a_2 T^2 + \cdots + a_r T^r.$$

We can extend the multiplication on the vector space V by the members of F to
the multiplication by the members of F[x] by defining $f(x) \cdot v = f(T)(v)$. Then
V becomes an F[x]-module in the sense that it satisfies all the postulates of a vector
space with the field F replaced by the polynomial ring F[x].

Similarly, if A is an n × n matrix with entries in a field F, and f(x) a polynomial
in F[x], then we have the matrix f(A) defined by

$$f(A) = a_0 I_n + a_1 A + a_2 A^2 + \cdots + a_r A^r.$$

It may be observed that if A is the matrix of T with respect to a certain basis, then f(A)
is the matrix of f(T) with respect to the same basis. It may also be observed that if
$\lambda$ is an eigenvalue of T with eigenvector v, then $f(T)(v) = f(\lambda)v$, and so $f(\lambda)$
is an eigenvalue of f(T). If A is an n × n matrix, then $F^n$ becomes an F[x]-module
with respect to the external product defined by $f(x) \cdot \bar{X}^t = f(A) \cdot \bar{X}^t$, where $\bar{X}^t$ is a
column vector in $F^n$. Since $x \cdot \bar{X}^t = A \cdot \bar{X}^t$ in this module, it is clear that
$(xI_n - A) \cdot \bar{X}^t = 0$ for all $\bar{X}^t \in F^n$. Matrix theory with entries in the polynomial ring
F[x] can be developed on the pattern it was developed for matrices with entries in F. For
example, we can talk of the adjoint of a matrix and the determinant of a matrix, and the relation
$\operatorname{adj}(A) \cdot A = \det(A) I_n$ holds also for matrices with entries in the ring F[x] (the proof
goes exactly on the same lines).
Following is one of the most fundamental results in linear algebra.


Theorem 5.4.16 (Cayley Hamilton Theorem) Every square matrix satisfies its own
characteristic polynomial. More precisely, if A is a square matrix, then $\phi_A(A) = 0$.


Proof $\phi_A(x) = \operatorname{Det}(xI_n - A)$. The matrix $xI_n - A$ is a matrix with entries in F[x].
From the discussion above, it follows that

$$\operatorname{adj}(xI_n - A) \cdot (xI_n - A) = \operatorname{Det}(xI_n - A) I_n = \phi_A(x) I_n.$$

Hence, for every column vector $\bar{X}^t \in F^n$,

$$\phi_A(A) \cdot \bar{X}^t = \phi_A(x) \cdot \bar{X}^t = \phi_A(x) I_n \cdot \bar{X}^t = \operatorname{adj}(xI_n - A)(xI_n - A) \cdot \bar{X}^t = 0$$

(see the discussion in the paragraph just above the theorem). This shows that the
matrix $\phi_A(A) = 0$. ♯
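The theorem can be checked on small matrices by evaluating the characteristic polynomial at the matrix itself. The following is a minimal sketch using Horner's rule; the function names are illustrative, the first test matrix is the one from Example 5.4.2 with characteristic polynomial $x^3 - 3x^2 + 2x$, and the second is a 3 × 3 unitriangular matrix chosen here for illustration, whose characteristic polynomial is $(x - 1)^3$.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def poly_at_matrix(coeffs, a):
    """Evaluate c_0 A^n + c_1 A^(n-1) + ... + c_n I by Horner's rule."""
    n = len(a)
    value = [[0] * n for _ in range(n)]
    for c in coeffs:
        value = matmul(value, a)  # multiply the running value by A
        for i in range(n):
            value[i][i] += c      # then add c times the identity
    return value


A = [[1, 0, 1], [0, 1, 0], [1, 1, 1]]
phi_A = [1, -3, 2, 0]    # x^3 - 3x^2 + 2x

U = [[1, 2, 3], [0, 1, 4], [0, 0, 1]]
phi_U = [1, -3, 3, -1]   # (x - 1)^3

zero_A = poly_at_matrix(phi_A, A)
zero_U = poly_at_matrix(phi_U, U)
```

Both `zero_A` and `zero_U` are the 3 × 3 zero matrix.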


Let A be a 3 × 3 unitriangular matrix. Then its characteristic polynomial is $\phi_A(x) =
(x - 1)^3$. From the Cayley Hamilton theorem, $(A - I_3)^3 = 0$. In other words,
$A^3 - 3A^2 + 3A - I_3 = 0$. This shows that $A(A^2 - 3A + 3I_3) = I_3$, and so
$A^2 - 3A + 3I_3$ is the inverse of A. A similar result holds for any n × n unitriangular
matrix. This also says that if A is a strictly triangular n × n matrix, then $A^n = 0$. If A
is nonsingular, then the constant term $(-1)^n a_n$ in the characteristic polynomial $\phi_A(x)$,
being $(-1)^n$ times the determinant of A, is nonzero. Since $\phi_A(A)$ is the zero matrix,

$$(-1)^n a_n I_n = -\left(A^n - a_1 A^{n-1} + a_2 A^{n-2} - \cdots + (-1)^r a_r A^{n-r} + \cdots + (-1)^{n-1} a_{n-1} A\right),$$

where $a_r$ is the sum of the principal r-minors of A. It follows that the inverse of a
matrix A, if it exists, is a polynomial in A. This also gives an algorithm to find the
inverse of A.
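The algorithm mentioned above can be sketched as follows, assuming exact rational entries. With $c_k = (-1)^k a_k$ the Cayley Hamilton relation reads $c_0A^n + c_1A^{n-1} + \cdots + c_nI_n = 0$, so $A^{-1} = -\frac{1}{c_n}\left(c_0A^{n-1} + c_1A^{n-2} + \cdots + c_{n-1}I_n\right)$. The function names are illustrative, and the coefficients are computed here from principal minors, which is practical only for small n.

```python
from fractions import Fraction
from itertools import combinations


def det(m):
    """Determinant by Gaussian elimination over exact fractions."""
    a = [[Fraction(x) for x in row] for row in m]
    n = len(a)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            a[r] = [x - factor * y for x, y in zip(a[r], a[c])]
    return result


def char_poly(a):
    """Coefficients c_k = (-1)^k a_k of det(xI - A), highest power first."""
    n = len(a)
    coeffs = [Fraction(1)]
    for r in range(1, n + 1):
        a_r = sum(det([[a[i][j] for j in idx] for i in idx])
                  for idx in combinations(range(n), r))
        coeffs.append((-1) ** r * a_r)
    return coeffs


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def inverse_by_cayley_hamilton(a):
    """Inverse of A written as a polynomial in A."""
    n = len(a)
    c = char_poly(a)
    if c[n] == 0:
        raise ValueError("det(A) = 0, so A has no inverse")
    q = [[Fraction(0)] * n for _ in range(n)]
    for ck in c[:n]:  # Horner: q <- q A + ck I, ending with the degree n-1 polynomial
        q = matmul(q, a)
        for i in range(n):
            q[i][i] += ck
    return [[-x / c[n] for x in row] for row in q]


A = [[1, 1, 0], [0, 2, 3], [0, 0, 3]]
A_inv = inverse_by_cayley_hamilton(A)
```

For this A the characteristic polynomial is $x^3 - 6x^2 + 11x - 6$, so `A_inv` equals $\frac{1}{6}(A^2 - 6A + 11I_3)$.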

A linear transformation T on V is said to be triangulable if there is a basis of V
with respect to which the matrix of T is a triangular matrix. A matrix A with entries in
F is said to be triangulable if $L_A$ is triangulable. This is equivalent to saying that A is
similar to a triangular matrix. In general, a matrix with entries in a field need not be similar to a
triangular matrix. Consider the matrix

$$\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$$

over the field $\mathbb{R}$ of real numbers. This is not similar to any triangular matrix over $\mathbb{R}$.
For if it were similar to a triangular matrix over $\mathbb{R}$, then its eigenvalues would be real
(the diagonal terms of the triangular matrix to which it is similar). But this matrix has no
real eigenvalues.


Theorem 5.4.17 A linear transformation T on V is triangulable if and only if there
exists an ascending chain

$$\{0\} = V_0 \subset V_1 \subset V_2 \subset \cdots \subset V_n = V$$

of invariant subspaces, called a flag of V, such that the dimension of $V_i$ is i.


Proof Suppose that such a chain of invariant subspaces exists. By induction, we show
the existence of a basis $\{x_1, x_2, \ldots, x_n\}$ of V such that $\{x_{n-r+1}, x_{n-r+2}, \ldots, x_n\}$ is a
basis of $V_r$ for each r. Let $\{x_n\}$ be a basis of $V_1$. Since $\{x_n\}$ is a linearly independent
subset of $V_2$, it can be extended to a basis $\{x_{n-1}, x_n\}$ of $V_2$. Proceeding inductively,
we find a basis $\{x_1, x_2, \ldots, x_n\}$ of V with the required property. Since each $V_{n-r+1}$,
which has basis $\{x_r, x_{r+1}, \ldots, x_n\}$, is invariant under T, it follows that

$$T(x_r) = a_{rr}x_r + a_{r\,r+1}x_{r+1} + \cdots + a_{rn}x_n$$

for each r. This means that the matrix of T with respect to this basis is triangular.
Conversely, suppose that T is triangulable. Then there is a basis $\{x_1, x_2, \ldots, x_n\}$
with respect to which the matrix of T is an upper triangular matrix. But, then

$$T(x_r) = a_{rr}x_r + a_{r\,r+1}x_{r+1} + \cdots + a_{rn}x_n$$

for each r. Let $V_r$ be the subspace of V generated by $\{x_{n-r+1}, x_{n-r+2}, \ldots, x_n\}$. Then
it follows that $V_r$ is invariant under T, the dimension of $V_r$ is r, and we have the chain

$$\{0\} \subset V_1 \subset V_2 \subset \cdots \subset V_n = V.$$ ♯


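Corollary 5.4.18 A matrix A is triangulable if and only if there is a chain

$$\{0\} = V_0 \subset V_1 \subset V_2 \subset \cdots \subset V_n = F^n$$

of subspaces of $F^n$ such that the dimension of $V_r$ is r and $A \cdot \bar{X}^t = L_A(\bar{X}^t) \in V_r$ for all
r, and for all $\bar{X}^t \in V_r$. ♯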


As observed earlier, a matrix need not be triangulable. The reason was that there 
need not be any eigenvalue of the matrix. A field F is called algebraically closed, 
if every polynomial in F[x] has all its roots in F. It is a fact that every field can be 
enlarged to an algebraically closed field (see Chap. 9). 


Theorem 5.4.19 Every linear transformation on a finite-dimensional vector space V over an
algebraically closed field is triangulable.


Proof Let T be a linear transformation on a vector space V over an algebraically
closed field F. We have to show the existence of a chain

$$\{0\} = V_0 \subset V_1 \subset V_2 \subset \cdots \subset V_n = V$$

of invariant subspaces. The proof is by induction on the dimension of V. If
dim V = 1, then there is nothing to prove. Assume that the result is true for all
vector spaces of dimension less than n. Suppose that the dimension of V is $n \geq 2$.
Since F is algebraically closed, the characteristic polynomial $\phi_T(x)$ has a root
$\lambda \in F$. Then $\lambda$ is an eigenvalue. Thus, there exists a nonzero vector v in V such
that $T(v) = \lambda v$. Let $V_1$ be the subspace generated by v. Then $V_1$ is of dimension
1. Since $T(v) = \lambda v \in V_1$, $V_1$ is an invariant subspace. Consider the vector space
$W = V/V_1$. Then the dimension of W is n − 1, and since $T(V_1) \subseteq V_1$, T induces a
linear transformation $\bar{T}$ on W defined by $\bar{T}(w + V_1) = T(w) + V_1$. By the induction
hypothesis there is a chain

$$\{V_1\} = W_0 \subset W_1 = V_2/V_1 \subset W_2 = V_3/V_1 \subset \cdots \subset W_{n-1} = V_n/V_1 = W$$

of invariant subspaces of $\bar{T}$, such that the dimension of $W_r$ is r, and so the dimension of
$V_{r+1}$ is r + 1. We show that $V_r$ is invariant under T for each r. Already, $V_1$ is
invariant under T. Let $x \in V_r$, $r \geq 2$. Then $\bar{T}(x + V_1) = T(x) + V_1$ belongs to
$W_{r-1} = V_r/V_1$. This implies that $T(x) \in V_r$. ♯


Corollary 5.4.20 Every square matrix A with entries in an algebraically closed field
is similar to a triangular matrix.


Proof To say that A is similar to a triangular matrix is to say that $L_A$ is triangulable.
The result follows from the above theorem. ♯


5.5 Spectral Theorem, and Orthogonal Reduction 


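Theorem 5.5.1 Let V be a complex inner product space, and T a Hermitian linear
transformation on V. Then all the eigenvalues of T are real.


Proof Let $\lambda$ be an eigenvalue of T. Then there exists a nonzero vector $x \in V$ such
that $T(x) = \lambda x$. Since T is Hermitian, $\langle T(u), v \rangle = \langle u, T(v) \rangle$ for all $u, v \in V$,
and hence

$$\lambda\langle x, x \rangle = \langle \lambda x, x \rangle = \langle T(x), x \rangle = \langle x, T(x) \rangle = \langle x, \lambda x \rangle = \bar{\lambda}\langle x, x \rangle.$$

Since $x \neq 0$, $\langle x, x \rangle \neq 0$, and so $\lambda = \bar{\lambda}$. ♯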


Corollary 5.5.2 Let V be a finite-dimensional complex inner product space, and
T a Hermitian linear transformation on V. Then all the roots of the characteristic
polynomial of T are real.


Proof Since the field $\mathbb{C}$ of complex numbers is algebraically closed, all the roots of
the characteristic polynomial of T exist in $\mathbb{C}$, and they are the eigenvalues of T. The
result follows from the above theorem. ♯


Corollary 5.5.3 All eigenvalues of Hermitian matrices are real.


Proof The matrix A is Hermitian if and only if $L_A$ is a Hermitian linear transformation
on the standard complex inner product space $\mathbb{C}^n$. ♯


Corollary 5.5.4 All roots of the characteristic polynomial of a real symmetric matrix
are real. In particular, every real symmetric matrix has an eigenvalue.


Proof A real symmetric matrix can also be taken to be a complex Hermitian matrix.
The result follows from the above corollary. ♯


Corollary 5.5.5 All roots of the characteristic polynomial of a symmetric linear
transformation T on a real inner product space V are real. In particular, if T is a
real symmetric linear transformation, then there is a real number $\lambda$, and $x \neq 0$ such
that $T(x) = \lambda x$. ♯




Corollary 5.5.6 All nonzero eigenvalues of a skew-Hermitian matrix are purely
imaginary.


Proof We know that A is skew-Hermitian if and only if iA is Hermitian. Now, $\lambda$ is
an eigenvalue of A if and only if $i\lambda$ is an eigenvalue of iA. This shows that $i\lambda$ is real,
and so $\lambda$ is purely imaginary. ♯


Corollary 5.5.7 Let A be a real skew-symmetric matrix. Then there is no nonzero
real eigenvalue of A. In other words, if $A \cdot \bar{X}^t = \lambda\bar{X}^t$, where $\lambda$ is real, then $\lambda = 0$ or
$\bar{X}^t = 0$.


Proof A real skew-symmetric matrix is also a skew-Hermitian matrix. Hence, all
the nonzero roots of its characteristic polynomial are purely imaginary. Thus, there
is no $\bar{X}^t \neq 0$ and real number $\lambda \neq 0$ such that $A \cdot \bar{X}^t = \lambda\bar{X}^t$. ♯


Proposition 5.5.8 Let T be a unitary linear transformation (matrix) on a complex
inner product space, and $\lambda$ an eigenvalue of T. Then $|\lambda| = 1$.


Proof Let $x \neq 0$ be an eigenvector corresponding to the eigenvalue $\lambda$. Then

$$|\lambda|^2\langle x, x \rangle = \lambda\bar{\lambda}\langle x, x \rangle = \langle \lambda x, \lambda x \rangle = \langle T(x), T(x) \rangle = \langle T^{\star}T(x), x \rangle = \langle x, x \rangle.$$

Since $x \neq 0$, $\langle x, x \rangle \neq 0$. Hence $|\lambda|^2 = 1$. ♯


Proposition 5.5.9 Let T be an orthogonal linear transformation (matrix) on a real
inner product space, and $\lambda$ a real eigenvalue of T. Then $\lambda = \pm 1$.


Proof Let $x \neq 0$ be an eigenvector corresponding to a real eigenvalue $\lambda$. Then as in
the previous proposition

$$\lambda^2\langle x, x \rangle = \langle x, x \rangle,$$

and so $\lambda^2 = 1$. Hence $\lambda = \pm 1$. ♯


Proposition 5.5.10 Let V be an inner product space. Let T be a linear transformation
on V, and W a subspace of V which is invariant under T. Then the orthogonal
complement $W^{\perp}$ of W is invariant under $T^{\star}$.


Proof Since W is invariant under T, for each $y \in W$, $T(y) \in W$. Let $x \in W^{\perp}$.
Then for each $y \in W$, $\langle y, T^{\star}(x) \rangle = \langle T(y), x \rangle = 0$. This shows that
$T^{\star}(x) \in W^{\perp}$. ♯


Corollary 5.5.11 Let T be a Hermitian linear transformation on an inner product
space V, and W an invariant subspace of T. Then $W^{\perp}$ is also invariant under T.


Proposition 5.5.12 Let T be a Hermitian linear transformation on a complex (real)
inner product space V. Let $x_1$ and $x_2$ be eigenvectors corresponding to distinct
eigenvalues $\lambda_1$ and $\lambda_2$ of T. Then $\langle x_1, x_2 \rangle = 0$. In other words, $V_{\lambda_1} \perp V_{\lambda_2}$.


Proof From previous results, $\lambda_1$ and $\lambda_2$ are real. Further,

$$\lambda_1\langle x_1, x_2 \rangle = \langle \lambda_1 x_1, x_2 \rangle = \langle T(x_1), x_2 \rangle = \langle x_1, T(x_2) \rangle = \langle x_1, \lambda_2 x_2 \rangle = \lambda_2\langle x_1, x_2 \rangle.$$

Since $\lambda_1 \neq \lambda_2$, we see that $\langle x_1, x_2 \rangle = 0$. ♯


If we apply the above result to $L_A$ on the standard inner product space, then we
have the following corollary:


Corollary 5.5.13 Let A be a Hermitian (real symmetric) matrix with eigenvectors
$\bar{X}_1$ and $\bar{X}_2$ corresponding to distinct eigenvalues. Then $\bar{X}_1\bar{X}_2^{\star} = 0$ ($\bar{X}_1\bar{X}_2^{t} = 0$). ♯


Theorem 5.5.14 (Spectral Theorem) Let T be a Hermitian linear transformation
on a finite-dimensional complex (real) inner product space V. Then there is an
orthonormal basis consisting of eigenvectors of T.


Proof The proof is by induction on dim V. If dim V = 1, then take any nonzero
vector of V and divide it by its length to get a unit vector v. Clearly, {v} is an
orthonormal basis of V, and since dim V = 1, v is an eigenvector of T. Assume
that the result holds for all Hermitian linear transformations on vector spaces of
dimension less than n. Let T be a Hermitian linear transformation on a complex
(real) inner product space V of dimension n. Now, T, being a Hermitian linear
transformation on a complex (real) inner product space V, has an eigenvector $x_1$.
Dividing $x_1$ by its length, we may assume that $x_1$ is a unit vector. Let W be the
subspace of V generated by $x_1$. Then $V = W \oplus W^{\perp}$. Clearly, W is invariant under
T. Since T is Hermitian (real symmetric), $W^{\perp}$ is also invariant under T. It is clear
that the restriction $T|W^{\perp}$ of T to $W^{\perp}$ is also Hermitian (real symmetric), and the
dimension of $W^{\perp}$ is n − 1. By the induction hypothesis, $W^{\perp}$ has an orthonormal
basis $\{x_2, x_3, \ldots, x_n\}$ consisting of eigenvectors of $T|W^{\perp}$ (and so of T). Clearly,
then $\{x_1, x_2, \ldots, x_n\}$ is an orthonormal basis of V consisting of eigenvectors of T. ♯


Corollary 5.5.15 Let T be a Hermitian linear transformation on a complex (real)
inner product space. Let $\{\lambda_1, \lambda_2, \ldots, \lambda_r\}$ be the set of all distinct eigenvalues of T.
Then

$$V = V_{\lambda_1} \oplus V_{\lambda_2} \oplus \cdots \oplus V_{\lambda_r}.$$


Proof Since eigenspaces corresponding to distinct eigenvalues are orthogonal, the
result follows from the above theorem. ♯


Corollary 5.5.16 The matrix representation of a Hermitian linear transformation
with respect to a suitable orthonormal basis is a diagonal matrix. ♯


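Corollary 5.5.17 Let A be a Hermitian (real symmetric) matrix. Then A is similar
to a diagonal matrix. In fact, there exists a unitary matrix U (an orthogonal matrix
O) such that $U^{\star}AU$ ($O^tAO$) is a diagonal matrix.


Proof A is a Hermitian (real symmetric) matrix if and only if $L_A$ is a Hermitian (real
symmetric) linear transformation on the standard complex (real) inner product space.
Thus, there exists an orthonormal basis $\{\bar{X}_1^{\star}, \bar{X}_2^{\star}, \ldots, \bar{X}_n^{\star}\}$ (respectively
$\{\bar{X}_1^{t}, \bar{X}_2^{t}, \ldots, \bar{X}_n^{t}\}$) of the standard complex (real) inner product space consisting
of column eigenvectors of $L_A$ (and so of A also), where each $\bar{X}_i$ is a row vector. Now,
the standard inner product is given by $\langle \bar{X}_i, \bar{X}_j \rangle = \bar{X}_i \cdot \bar{X}_j^{\star}$ in the complex case, and
by $\langle \bar{X}_i, \bar{X}_j \rangle = \bar{X}_i \cdot \bar{X}_j^{t}$ in the real case, where · on the R.H.S. is matrix multiplication.
Thus, $\bar{X}_i \cdot \bar{X}_i^{\star} = 1$ ($\bar{X}_i \cdot \bar{X}_i^{t} = 1$), and for $i \neq j$, $\bar{X}_i \cdot \bar{X}_j^{\star} = 0$ ($\bar{X}_i \cdot \bar{X}_j^{t} = 0$). Let U
(respectively O) denote the matrix whose ith row is $\bar{X}_i$. Then the above observation
says that U (O) is unitary (orthogonal), and

$$UAU^{\star} = \begin{pmatrix} \bar{X}_1 \\ \bar{X}_2 \\ \vdots \\ \bar{X}_n \end{pmatrix} \cdot A \cdot [\bar{X}_1^{\star}, \bar{X}_2^{\star}, \ldots, \bar{X}_n^{\star}] = \begin{pmatrix} \bar{X}_1 \\ \bar{X}_2 \\ \vdots \\ \bar{X}_n \end{pmatrix} \cdot [\lambda_1\bar{X}_1^{\star}, \lambda_2\bar{X}_2^{\star}, \ldots, \lambda_n\bar{X}_n^{\star}] = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n),$$

where $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of A. Similarly, if A is a real symmetric matrix,
then $OAO^{t} = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$. Replacing U by $U^{\star}$ (O by $O^t$), which is again
unitary (orthogonal), gives the form in the statement. ♯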


Remark 5.5.18 We have an algorithm to find an orthonormal basis of V consisting
of eigenvectors of T provided that we have an algorithm to find the roots of the characteristic
polynomial of T (indeed, we have algorithms to solve an nth degree equation for $n \leq 4$
(see Chap. 9 on Fields and Galois theory)). After getting the distinct eigenvalues,
we can find the corresponding eigenspaces, and then use the Gram-Schmidt process to
find an orthonormal basis of each eigenspace. In turn, this gives an orthonormal basis
consisting of eigenvectors. This also gives us a method to diagonalize a Hermitian,
and also a real symmetric, matrix. We illustrate it by means of an example.


Example 5.5.19 Consider the matrix 


3 0 
A= | O46 
lpn 3 
39 5 


This is a real symmetric matrix. We find orthogonal matrix O such that O'AO is a 
diagonal matrix. The characteristic polynomial ¢,(x) is given by 


da(x) = Det(xl — A) = (x 3)(x D(x 3) cue 


5.5 Spectral Theorem, and Orthogonal Reduction 163 


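The roots of the characteristic polynomial are 1, 1, and 2. We find the eigenspace $R_1$
of the eigenvalue 1. Suppose that the column vector $[u, v, w]^t$ belongs to $R_1$. Then

$$A \cdot \begin{bmatrix} u \\ v \\ w \end{bmatrix} = \begin{bmatrix} u \\ v \\ w \end{bmatrix}.$$

Equating rows, we get that $u = w$. Thus

$$R_1 = \{ [u, v, w]^t \text{ such that } u = w \}.$$

This subspace is clearly of dimension 2. Putting $u = 1 = w = v$, we get a nonzero
member of $R_1$. Another nonzero member of $R_1$ which is not a multiple of the previous
element is obtained by taking $u = 0 = w$ and $v = 1$. Hence

$$\{ [1, 1, 1]^t, \; [0, 1, 0]^t \}$$

is a basis of $R_1$. Using the Gram-Schmidt process we find

$$\left\{ \left[\tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}\right]^t, \; \left[-\tfrac{1}{\sqrt{6}}, \tfrac{2}{\sqrt{6}}, -\tfrac{1}{\sqrt{6}}\right]^t \right\}$$

as an orthonormal basis of $R_1$. The eigenspace of the eigenvalue 2 is orthogonal to $R_1$,
and so it is spanned by the unit vector $\left[\tfrac{1}{\sqrt{2}}, 0, -\tfrac{1}{\sqrt{2}}\right]^t$. This gives us an orthonormal basis

$$\left\{ \left[\tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}\right]^t, \; \left[-\tfrac{1}{\sqrt{6}}, \tfrac{2}{\sqrt{6}}, -\tfrac{1}{\sqrt{6}}\right]^t, \; \left[\tfrac{1}{\sqrt{2}}, 0, -\tfrac{1}{\sqrt{2}}\right]^t \right\}$$

of $\mathbb{R}^3$ consisting of eigenvectors of $A$. In turn, we get an orthogonal matrix

$$O = \begin{bmatrix} \frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{6}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{3}} & \frac{2}{\sqrt{6}} & 0 \\ \frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{6}} & -\frac{1}{\sqrt{2}} \end{bmatrix},$$

whose columns are the members of this basis, such that $O^tAO = Diag(1, 1, 2)$.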


Example 5.5.20 Let $A$ be an $n \times n$ real symmetric matrix with eigenvalues $\lambda_1, \lambda_2, \ldots,
\lambda_n$ counted with their multiplicities. Let $f(x)$ be a polynomial with real coefficients.
Let $\mu_1, \mu_2, \ldots, \mu_n$ be real numbers with $f(\mu_i) = \lambda_i$ for all $i$. Then we can find a
real symmetric matrix $B$ such that $f(B) = A$ as follows: From Corollary 5.5.17,
there exists an orthogonal matrix $O$ such that $O^tAO = Diag(\lambda_1, \lambda_2, \ldots, \lambda_n)$. Let
$B = O \, Diag(\mu_1, \mu_2, \ldots, \mu_n) \, O^t$. Then $f(B) = O \, Diag(f(\mu_1), f(\mu_2), \ldots, f(\mu_n)) \,
O^t = O \, Diag(\lambda_1, \lambda_2, \ldots, \lambda_n) \, O^t = A$. Thus, $B$ is a solution of $f(X) = A$. In
particular, if we take $B = O \, Diag(1, 1, \sqrt{2}) \, O^t$, where $O$ is as in the above example,
then $B^2 = A$, where $A$ is as in the above example. Can we count the number of
solutions of $X^2 = A$, where $X$ is an unknown in the set of real symmetric matrices?
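
The two preceding examples can be checked numerically. The sketch below assumes that the matrix of Example 5.5.19 is the symmetric matrix with rows $(3/2, 0, -1/2)$, $(0, 1, 0)$, $(-1/2, 0, 3/2)$, which is the matrix determined by the eigenvalues 1, 1, 2 and the eigenspace $u = w$ stated there, and it takes one particular choice of orthonormal eigenvectors as the columns of $O$. The helper names are ours.

```python
import math

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def diag(entries):
    n = len(entries)
    return [[entries[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

s2, s3, s6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)

# Assumed reading of the matrix A of Example 5.5.19.
A = [[1.5, 0.0, -0.5],
     [0.0, 1.0, 0.0],
     [-0.5, 0.0, 1.5]]

# Columns: orthonormal eigenvectors for the eigenvalues 1, 1, 2.
O = [[1 / s3, -1 / s6, 1 / s2],
     [1 / s3, 2 / s6, 0.0],
     [1 / s3, -1 / s6, -1 / s2]]

D = matmul(matmul(transpose(O), A), O)                      # O^t A O
B = matmul(matmul(O, diag([1.0, 1.0, s2])), transpose(O))   # a square root of A
B_squared = matmul(B, B)
```

Up to rounding, `D` is $Diag(1, 1, 2)$ and `B_squared` agrees with `A`.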


Let $B$ be any nonsingular complex (real) matrix. Then the matrix $A = BB^*$ ($BB^t$)
is a Hermitian (real symmetric) matrix. Further, let $\lambda$ be an eigenvalue of $A$. Since
$\langle XBB^*, X \rangle = \langle XB, XB \rangle$ is positive for every nonzero row vector $X$ ($B$ being
nonsingular), it follows that all the eigenvalues of $A$ are positive. Conversely, if all the
eigenvalues of a Hermitian (real symmetric) matrix $A$ are positive (non-negative), then,
as described above, we can find a positive definite (positive) Hermitian (real symmetric)
matrix $B$ such that $A = B^2 = BB^*$ ($BB^t$).


Polar Decomposition 


Every nonzero complex number $z = a + ib$ is a nonsingular $1 \times 1$ matrix which is
uniquely expressible in polar form as $z = |z| u = re^{i\theta}$, where $r$ is a positive definite
$1 \times 1$ Hermitian matrix, and $u = e^{i\theta}$ is a $1 \times 1$ unitary matrix. More generally, every
nonsingular complex square matrix (indeed, every complex matrix) $A$ can be uniquely
expressed as $A = B + iC$, where $B$ and $C$ are Hermitian matrices. The following is the
multiplicative analog of this identity, called the polar decomposition.


Proposition 5.5.21 Every nonsingular square complex matrix A can be uniquely 
expressed as A = PU, where P is a positive definite Hermitian matrix, and U is a 
unitary matrix. 


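Proof Consider the matrix $AA^*$. Since $A$ is nonsingular, $AA^*$ is a Hermitian matrix
all of whose eigenvalues are positive. As such, $AA^*$ is a positive definite Hermitian
matrix. Let $P$ be a positive definite Hermitian matrix which is a square root of $AA^*$.
Take $U = P^{-1}A$. Then $UU^* = P^{-1}AA^*(P^{-1})^*$. Again, since $P$ is Hermitian, $P^{-1}$ is
also Hermitian (indeed, $(P^{-1})^* = (P^*)^{-1} = P^{-1}$). Thus, $UU^* = P^{-1}AA^*(P^{-1})^* =
P^{-1}P^2P^{-1} = I$. This shows that $U$ is unitary, and $A = PU$. For the uniqueness, if
$A = P_1U_1$ is another such expression, then $AA^* = P_1U_1U_1^*P_1^* = P_1^2$. Since a positive
definite Hermitian matrix has exactly one positive definite Hermitian square root,
$P_1 = P$, and so $U_1 = P^{-1}A = U$. □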
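
The construction in the proof can be sketched for real $2 \times 2$ matrices. This is an illustration only: the closed form used for the square root, $\sqrt{M} = (M + sI)/t$ with $s = \sqrt{\det M}$ and $t = \sqrt{tr\, M + 2s}$, is a standard fact about $2 \times 2$ positive definite matrices which we bring in for convenience (it is not the method of the text), and the function names are ours.

```python
import math

def matmul2(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose2(X):
    return [[X[0][0], X[1][0]], [X[0][1], X[1][1]]]

def inverse2(X):
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / det, -X[0][1] / det],
            [-X[1][0] / det, X[0][0] / det]]

def sqrt_spd2(M):
    """Positive definite square root of a 2x2 symmetric positive definite M."""
    s = math.sqrt(M[0][0] * M[1][1] - M[0][1] * M[1][0])
    t = math.sqrt(M[0][0] + M[1][1] + 2 * s)
    return [[(M[0][0] + s) / t, M[0][1] / t],
            [M[1][0] / t, (M[1][1] + s) / t]]

def polar2(A):
    """Return (P, U) with A = P U, P the square root of A A^t, U = P^{-1} A."""
    P = sqrt_spd2(matmul2(A, transpose2(A)))
    U = matmul2(inverse2(P), A)
    return P, U

A = [[1.0, 2.0], [0.0, 1.0]]
P, U = polar2(A)
```

For this $A$ one finds $U = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}$, which is orthogonal, and $PU$ returns $A$ up to rounding.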




Corollary 5.5.22 Every nonsingular matrix $A$ with real entries can be uniquely
expressed as $A = PO$, where $P$ is a positive definite real symmetric matrix, and $O$
is an orthogonal matrix. □


Example 5.5.23 Consider the matrix 


a a, 
2. fe. is 
a1, 

A= |" Ve 8 
go —v2 2 

3 V3 


This matrix is nonsingular. We find its polar decomposition. The matrix $B = AA^t$
is given by


The eigenvalues of B are 1, 1, 4. Using the algorithm as described in Remark 5.5.18 
(see Example 5.5.19), we find an orthogonal matrix O given by 


1 
Wii 
1 1 1 
O= 1" ve v2 | 
go —v2 1 
V3 V3 


such that $B = O^t \, Diag[1, 1, 4] \, O$. The positive square root $P$ of $B$ is given by
$P = O^t \, Diag[1, 1, 2] \, O$. As described in Proposition 5.5.21, $U = P^{-1}A$ is unitary
(indeed orthogonal, $A$ being real), and $A = PU$ is the polar decomposition of $A$.


Singular Value Decomposition 


Let $A$ be an $n \times m$ matrix with entries in a field $F$, where $F$ is $\mathbb{C}$ or $\mathbb{R}$. The matrix
$AA^*$ is a positive matrix in the sense that all its eigenvalues are non-negative. This is
because if $\lambda$ is an eigenvalue of $AA^*$, then there is a nonzero row vector $X$ such that
$AA^*X^* = \lambda X^*$. Hence $(XA)(XA)^* = \lambda XX^*$. Since $X$ is a nonzero vector, it follows that
$\lambda$ is non-negative.


Definition 5.5.24 The non-negative square root of an eigenvalue of $AA^*$ is called a
singular value of $A$.


If $A$ is Hermitian, then the eigenvalues of $A^2 = AA^*$ are precisely the squares $\lambda^2$ of
the eigenvalues $\lambda$ of $A$. As such, the singular values of Hermitian matrices are precisely
the absolute values of their eigenvalues.


Example 5.5.25 The singular values of the matrix 


eu] 


are $\sqrt{2}, \sqrt{2}$, for the eigenvalues of


are 2, 2. 


Proposition 5.5.26 Let $A$ be an $n \times m$ matrix and $X^*$ a unit column eigenvector of
$AA^*$ with $\lambda$ as the corresponding eigenvalue. Then $\|XA\|^2 = \lambda$. In turn, $\|XA\|$ is
the corresponding singular value. Further, if $x$ is a row eigenvector of $AA^*$, and $y$ is
a row vector orthogonal to $x$, then $xA$ and $yA$ are orthogonal to each other.


Proof Under the hypothesis, $\|XA\|^2 = XA(XA)^* = XAA^*X^* = \lambda XX^* = \lambda$.
Next, suppose that $x$ is a row eigenvector of $AA^*$ with associated eigenvalue $\lambda$. Then
$xAA^* = \lambda x$. In turn,

$$yA(xA)^* = yAA^*x^* = y(xAA^*)^* = \lambda yx^* = 0,$$

since $\lambda$ is real and $y$ is orthogonal to $x$. The result is evident. □


Corollary 5.5.27 Let $A$ be an $n \times m$ matrix. Then the rank of $A$ is $p$ if and only if there
are exactly $p$ strictly positive eigenvalues of $AA^*$ (counted with multiplicities).
Equivalently, there are exactly $p$ nonzero singular values of $A$.


Proof Suppose that $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_p$ are the nonzero eigenvalues of $AA^*$ and the rest
of the $n - p$ eigenvalues are 0. Let $\{r_1^*, r_2^*, \ldots, r_p^*, \ldots, r_n^*\}$ be an orthonormal
basis of $F^n$ (considering $F^n$ as the space of columns) consisting of column eigenvectors
of $AA^*$ with $r_1^*, r_2^*, \ldots, r_p^*$ corresponding to the eigenvalues $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_p$,
respectively. It follows from the above proposition that $\|r_iA\|$ is nonzero if and
only if $i \leq p$. Further, $r_iA(r_jA)^* = r_iAA^*r_j^* = \lambda_j r_i r_j^*$, which is $\lambda_i$ for $i = j$ and 0
otherwise. This shows that $\{r_1A, r_2A, \ldots, r_pA\}$, being an orthogonal set of nonzero
vectors, is linearly independent. Since the rest of the $r_iA$, $i > p$, are zero, and the
vectors $r_iA$, $1 \leq i \leq n$, span the row space of $A$, it follows that $p$ is the rank of $A$. □


Theorem 5.5.28 (Singular value decomposition) Let $A$ be an $n \times m$ matrix. Then
there exist a unitary/orthogonal $n \times n$ matrix $U$, and an $m \times m$ unitary/orthogonal
matrix $V$ such that $UAV = \Sigma$, where $\Sigma$ is an $n \times m$ matrix whose first $p$ diagonal
entries are the nonzero singular values $\sigma_1, \sigma_2, \ldots, \sigma_p$ of $A$ in non-ascending order, and
the rest of the entries are 0. In turn, $A = U^*\Sigma V^*$, where $U^*$ and $V^*$ are again
unitary/orthogonal matrices, and $\Sigma$ is as described.


Proof Let $\{r_1^*, r_2^*, \ldots, r_p^*, \ldots, r_n^*\}$ be an orthonormal basis of $F^n$ (considering $F^n$
as the space of columns) consisting of column eigenvectors of $AA^*$ with $r_1^*, r_2^*, \ldots, r_p^*$
corresponding to the nonzero eigenvalues $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_p$. From Proposition 5.5.26
and Corollary 5.5.27, it follows that $\{A^*r_1^*, A^*r_2^*, \ldots, A^*r_p^*\}$ is an orthogonal
set of nonzero column vectors in $F^m$. Let $s_j^*$ denote the column vector $\frac{1}{\sigma_j}A^*r_j^*$, where
$\sigma_j = \sqrt{\lambda_j}$. Then $\{s_1^*, s_2^*, \ldots, s_p^*\}$ is an orthonormal set of column vectors in $F^m$.
Embed it into an orthonormal basis $\{s_1^*, s_2^*, \ldots, s_m^*\}$ of $F^m$. Take $U$ to be the $n \times n$
matrix whose $i$th row is $r_i$, and $V$ the $m \times m$ matrix whose $j$th column is $s_j^*$. Then $U$
and $V$ are unitary/orthogonal matrices. The $i$th row $j$th column entry $c_{ij}$ of $UAV$ is given
by $r_iAs_j^*$. Further, $r_iA = 0$ for all $i > p$, $r_iA(r_jA)^* = 0$ for $i \neq j$, and
$r_iA(r_iA)^* = \lambda_i = \sigma_i^2$ for $i \leq p$. Now, it is evident that $c_{ii} = \sigma_i$ for all $i \leq p$,
and $c_{ij} = 0$ otherwise. □


The proof of the theorem is algorithmic. We illustrate it by means of the following 
example. 


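Example 5.5.29 Consider the matrix $A$ given by

$$A = \begin{bmatrix} 0 & 1 & 1 \\ 1 & 1 & 0 \end{bmatrix}.$$

Then

$$AA^t = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}.$$

The eigenvalues of $AA^t$ are 3, 1. Thus, the singular values of $A$ are $\sqrt{3}$, 1. A unit
eigenvector $r_1$ corresponding to 3 is $\left[\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}\right]$, and that $r_2$ corresponding to 1 is
$\left[\tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}\right]$. Further, $r_1A = \left[\tfrac{1}{\sqrt{2}}, \sqrt{2}, \tfrac{1}{\sqrt{2}}\right]$ and $r_2A = \left[-\tfrac{1}{\sqrt{2}}, 0, \tfrac{1}{\sqrt{2}}\right]$. Thus,
$s_1 = \tfrac{1}{\sqrt{3}} r_1A = \left[\tfrac{1}{\sqrt{6}}, \tfrac{\sqrt{2}}{\sqrt{3}}, \tfrac{1}{\sqrt{6}}\right]$ and $s_2 = r_2A = \left[-\tfrac{1}{\sqrt{2}}, 0, \tfrac{1}{\sqrt{2}}\right]$. We extend $\{s_1, s_2\}$ to an
orthonormal basis by adjoining $s_3 = \left[\tfrac{1}{\sqrt{3}}, -\tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}\right]$. The matrix $U$ is given by

$$U = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix},$$

the matrix $V$ is the transpose of

$$\begin{bmatrix} \frac{1}{\sqrt{6}} & \frac{\sqrt{2}}{\sqrt{3}} & \frac{1}{\sqrt{6}} \\ -\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} \end{bmatrix},$$

and the matrix $\Sigma$ is given by

$$\Sigma = \begin{bmatrix} \sqrt{3} & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}.$$

Further, $A = U^t\Sigma V^t$.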
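
The identity $UAV = \Sigma$ of the above example can be checked numerically. The sketch below rests on an assumption: it takes $A$ to be the $2 \times 3$ matrix with rows $(0, 1, 1)$ and $(1, 1, 0)$, which is the matrix consistent with the eigenvalues 3, 1 of $AA^t$ and the vectors $r_i$, $s_j$ of the example. The helper names are ours.

```python
import math

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

s2, s3, s6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)

A = [[0.0, 1.0, 1.0],
     [1.0, 1.0, 0.0]]            # assumed reading of the example's matrix

U = [[1 / s2, 1 / s2],           # rows r_1, r_2: unit eigenvectors of A A^t
     [1 / s2, -1 / s2]]

V_t = [[1 / s6, 2 / s6, 1 / s6],     # rows s_1, s_2, s_3
       [-1 / s2, 0.0, 1 / s2],
       [1 / s3, -1 / s3, 1 / s3]]
V = transpose(V_t)

Sigma = matmul(matmul(U, A), V)                      # U A V
A_back = matmul(matmul(transpose(U), Sigma), V_t)    # U^t Sigma V^t
```

Up to rounding, `Sigma` has rows $(\sqrt{3}, 0, 0)$ and $(0, 1, 0)$, and `A_back` agrees with `A`.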

Geometry of Orthogonal Transformation


Recall that a subspace $H$ of dimension $n - 1$ of the Euclidean space $\mathbb{R}^n$ is called a
hyperplane, and a translate $x + H$ is called an affine hyperplane.




Proposition 5.5.30 Let $H$ be a hyperplane in the Euclidean space $\mathbb{R}^n$. Then there is
a unit vector $x \in H^{\perp}$. Further, if $y$ is any unit vector in $H^{\perp}$, then $y = \pm x$.


Proof Let $\{s_1, s_2, \ldots, s_{n-1}\}$ be an orthonormal basis of $H$. Using the Gram-Schmidt
process, enlarge it to an orthonormal basis $\{s_1, s_2, \ldots, s_{n-1}, x\}$ of $\mathbb{R}^n$. Then $x$ is a
unit vector in $H^{\perp}$. Let

$$y = a_1s_1 + a_2s_2 + \cdots + a_{n-1}s_{n-1} + ax$$

be a unit vector which is a member of $H^{\perp}$. Then $0 = \langle y, s_i \rangle = a_i$ for all $i$, and so
$y = ax$. Further,

$$1 = \langle y, y \rangle = a\langle y, x \rangle = a^2\langle x, x \rangle = a^2.$$

Hence $a = \pm 1$. This shows that $y = \pm x$. □


If $x$ is a unit vector, the hyperplane $H$ consisting of vectors orthogonal to $x$ is
denoted by $H_x$. It follows that $H_x = H_y$ if and only if $x = \pm y$.


Proposition 5.5.31 Let $x$ be a unit vector in $\mathbb{R}^n$, and $H_x$ the corresponding
hyperplane. Then the map $\sigma_x$ from $\mathbb{R}^n$ to itself defined by

$$\sigma_x(y) = y - 2\langle y, x \rangle x$$

is the unique linear transformation which fixes the members of $H_x$, and takes $x$ to its
negative. Further, $\sigma_x$ is an orthogonal transformation with determinant $-1$ (refer to
the Exercises 4.4.35-4.4.40).


Proof Clearly, $\sigma_x$ is a linear transformation. If $\langle y, x \rangle = 0$, then by the definition,
$\sigma_x(y) = y$. Also $\sigma_x(x) = -x$. Further,

$$\langle \sigma_x(y), \sigma_x(z) \rangle = \langle y - 2\langle y, x \rangle x, \; z - 2\langle z, x \rangle x \rangle$$
$$= \langle y, z \rangle - 4\langle y, x \rangle\langle z, x \rangle + 4\langle y, x \rangle\langle z, x \rangle\langle x, x \rangle = \langle y, z \rangle.$$

This shows that $\sigma_x$ is an orthogonal transformation. Also, the matrix representation
of $\sigma_x$ with respect to the orthonormal basis $\{s_1, s_2, \ldots, s_{n-1}, x\}$ of $\mathbb{R}^n$,
where $\{s_1, s_2, \ldots, s_{n-1}\}$ is an orthonormal basis of $H_x$, is the diagonal matrix
$diag(1, 1, \ldots, 1, -1)$. Hence, its determinant is $-1$. Finally, if $\tau$ is any linear
transformation with the required property, then again the matrix representation of $\tau$ with
respect to the basis $\{s_1, s_2, \ldots, s_{n-1}, x\}$ is $diag(1, 1, \ldots, 1, -1)$. Hence, $\tau = \sigma_x$. □
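
The defining formula $\sigma_x(y) = y - 2\langle y, x \rangle x$ is easy to experiment with. The following minimal sketch (the function names and the sample vectors are ours) applies it in $\mathbb{R}^3$ for the unit vector $x = (0.6, 0.8, 0)$.

```python
def dot(x, y):
    """Standard real inner product."""
    return sum(a * b for a, b in zip(x, y))

def reflect(x, y):
    """Hyperplane reflection sigma_x(y) = y - 2<y, x> x, for a unit vector x."""
    c = 2 * dot(y, x)
    return [yi - c * xi for yi, xi in zip(y, x)]

x = [0.6, 0.8, 0.0]      # a unit vector
h = [0.8, -0.6, 0.0]     # a vector orthogonal to x, so h lies in H_x
y = [1.0, 2.0, 3.0]
z = [-2.0, 0.5, 4.0]

image_x = reflect(x, x)  # should be -x
image_h = reflect(x, h)  # should be h
```

One can also compare `dot(reflect(x, y), reflect(x, z))` with `dot(y, z)` to see that the inner product is preserved.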


Definition 5.5.32 The transformation $\sigma_x$ is called a hyperplane reflection (indeed,
it is the reflection about the hyperplane $H_x$).


Note that $\sigma_x = \sigma_y$ if and only if $x = \pm y$.

Recall that an inner product space $V$ is said to be the orthogonal sum of its subspaces
$V_1, V_2, \ldots, V_r$ if

$$V = V_1 + V_2 + \cdots + V_r,$$

and for $i \neq j$, the elements of $V_i$ are orthogonal to $V_j$. Symbolically, we write it as

$$V = V_1 \perp V_2 \perp \cdots \perp V_r.$$

In particular, if $W$ is a subspace of $V$, then $V = W \perp W^{\perp}$.


Theorem 5.5.33 Let $T$ be a Euclidean orthogonal transformation on $\mathbb{R}^n$. Then
there exist subspaces $V$, $W$, and two-dimensional subspaces $P_1, P_2, \ldots, P_l$ together
with angles $0 < \theta_1 \leq \theta_2 \leq \cdots \leq \theta_l < \pi$ such that the following hold:

1. $\mathbb{R}^n = V \perp W \perp P_1 \perp P_2 \perp \cdots \perp P_l$.

2. $T(x) = x$ for all $x \in V$.

3. $T(x) = -x$ for all $x \in W$.

4. The restriction of $T$ to $P_k$ is a rotation through the angle $\theta_k$ in the plane $P_k$.
In other words, there is an orthonormal basis $\{u_k, v_k\}$ of $P_k$ such that
(i) $T(u_k) = \cos\theta_k \, u_k + \sin\theta_k \, v_k$, and
(ii) $T(v_k) = -\sin\theta_k \, u_k + \cos\theta_k \, v_k$ for all $k \leq l$.


Proof The proof is by induction on $n$. For $n = 1$, it follows trivially. Assume that
the result is true for all $m \leq n$. Let $T$ be an orthogonal transformation on $\mathbb{R}^{n+1}$. Then
$T$ induces a unique linear transformation, again denoted by $T$, from the standard complex
vector space $\mathbb{C}^{n+1}$ to itself by putting $T(x) = xM(T)$, where $x = [x_1 \; x_2 \; \cdots \; x_{n+1}]$ is a
row vector in $\mathbb{C}^{n+1}$, and $M(T)$ is the matrix representation of $T$ with respect to the
standard basis of $\mathbb{R}^{n+1}$. Note that the matrix representation of the induced transformation
with respect to the standard basis of $\mathbb{C}^{n+1}$ is the same as $M(T)$. Let $\lambda$ be an eigenvalue of
$M(T)$, which may be a complex number. Then there is a complex unit vector
$x = [x_1 \; x_2 \; \cdots \; x_{n+1}]$ such that $xM(T) = \lambda x$. Since $M(T)$ is a real matrix,

$$\overline{x}M(T) = \overline{xM(T)} = \overline{\lambda}\,\overline{x},$$

where $\overline{x} = [\overline{x_1} \; \overline{x_2} \; \cdots \; \overline{x_{n+1}}]$ denotes the complex conjugate of the complex vector $x$,
and $\overline{\lambda}$ the complex conjugate of $\lambda$. This shows that $\overline{\lambda}$ is also an eigenvalue of $M(T)$,
and $\overline{x}$ is a corresponding eigenvector. Since $M(T)$ is orthogonal,

$$|\lambda|^2 = (\lambda x)(\lambda x)^* = xM(T)(xM(T))^* = xx^* = \|x\|^2 = 1.$$

This shows that $|\lambda| = 1$, and so $\lambda = e^{i\theta}$ for some $\theta$.

Now, suppose that $M(T)$ has a real eigenvalue $\lambda$. Then $\lambda = \pm 1$. Suppose again
that $\lambda = 1$ is an eigenvalue of $M(T)$, and $x$ is a unit eigenvector associated to
1. Clearly, $x$ can be taken to be a real vector. Consider the corresponding hyperplane
$H_x$. The dimension of $H_x$ is $n$. Further, if $y \in H_x$, then

$$\langle x, T(y) \rangle = \langle T(x), T(y) \rangle = \langle x, y \rangle = 0$$

(for $T$ is orthogonal). This shows that $T$ restricted to $H_x$ is an orthogonal
transformation on $H_x$. Also $\mathbb{R}^{n+1} = \langle x \rangle \perp H_x$. By the induction hypothesis, there exist
subspaces $V'$, $W$, and two-dimensional subspaces $P_1, P_2, \ldots, P_l$ of $H_x$ together with
angles $0 < \theta_1 \leq \theta_2 \leq \cdots \leq \theta_l < \pi$ such that the following hold:

1. $H_x = V' \perp W \perp P_1 \perp P_2 \perp \cdots \perp P_l$.

2. $T(x) = x$ for all $x \in V'$.

3. $T(x) = -x$ for all $x \in W$.

4. The restriction of $T$ to $P_k$ is a rotation through the angle $\theta_k$ in the plane $P_k$
for each $k \leq l$.

Taking $V = \langle x \rangle \perp V'$, the result holds for $\mathbb{R}^{n+1}$. If 1 is not an eigenvalue but
$-1$ is an eigenvalue, then a similar argument proves the result with $V = \{0\}$.

Assume that $M(T)$ has no real eigenvalues. Note that in this case $n + 1$ will be
even, $2m$ (say). Let $\theta_1$ be the smallest positive real number such that $\lambda = e^{i\theta_1}$
is an eigenvalue, and $x$ a corresponding complex eigenvector. As observed earlier,
$e^{-i\theta_1}$ is also an eigenvalue with $\overline{x}$ a corresponding eigenvector. Then $x + \overline{x}$
and $i(x - \overline{x})$ are nonzero real vectors. Take $u_1 = \frac{x + \overline{x}}{\|x + \overline{x}\|}$, and $v_1 = \frac{i(x - \overline{x})}{\|i(x - \overline{x})\|}$. Then
$\{u_1, v_1\}$ is an orthonormal subset of $\mathbb{R}^{n+1}$ which generates a subspace $P_1$. Then
$T(u_1) = u_1M(T) = \frac{1}{\|x + \overline{x}\|}(e^{i\theta_1}x + e^{-i\theta_1}\overline{x}) = \cos\theta_1 \, u_1 + \sin\theta_1 \, v_1$, and similarly,
$T(v_1) = -\sin\theta_1 \, u_1 + \cos\theta_1 \, v_1$. Let $U = P_1^{\perp}$. Then $U$ is of dimension
$2(m - 1)$, $\mathbb{R}^{n+1} = P_1 \perp U$, and $T$ restricted to $U$ is an orthogonal transformation
on $U$. By the induction hypothesis, there exist two-dimensional subspaces
$P_2, P_3, \ldots, P_m$ together with angles $0 < \theta_2 \leq \theta_3 \leq \cdots \leq \theta_m < \pi$ with $\theta_1 \leq \theta_2$
such that $U = P_2 \perp P_3 \perp \cdots \perp P_m$, and $T$ restricted to each $P_k$ is a rotation through
the angle $\theta_k$. Clearly, $\mathbb{R}^{n+1} = P_1 \perp P_2 \perp P_3 \perp \cdots \perp P_m$. □

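Corollary 5.5.34 Suppose that the dimension of $W$ in the above theorem is $m$, and
$\{w_1, w_2, \ldots, w_m\}$ is an orthonormal basis of $W$. Then

$$T = \sigma_{w_1} \circ \sigma_{w_2} \circ \cdots \circ \sigma_{w_m} \circ \rho_{P_1} \circ \rho_{P_2} \circ \cdots \circ \rho_{P_l},$$

where $\sigma_{w_i}$ is the reflection about the hyperplane $H_{w_i}$, and $\rho_{P_j}$ denotes the rotation
through an angle $\theta_j$ in the plane $P_j$, and it is given by
(i) $\rho_{P_j}(x) = x$ for all $x$ in $V$, $W$ and $P_k$, $k \neq j$,
(ii) $\rho_{P_j}(u_j) = \cos\theta_j \, u_j + \sin\theta_j \, v_j$,
(iii) $\rho_{P_j}(v_j) = -\sin\theta_j \, u_j + \cos\theta_j \, v_j$, where $\{u_j, v_j\}$ is the orthonormal basis of
$P_j$, $j \leq l$, as in the theorem. □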


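Corollary 5.5.35 The transformation $T$ in Theorem 5.5.33 is a composition of $m +
2l$ hyperplane reflections, where $m = \dim W$.


Proof From the above corollary, it is sufficient to show that the rotation $\rho_{P_j}$ is a
composition of two hyperplane reflections. Since

$$\begin{bmatrix} \cos\theta_j & \sin\theta_j \\ -\sin\theta_j & \cos\theta_j \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} \cos\theta_j & \sin\theta_j \\ \sin\theta_j & -\cos\theta_j \end{bmatrix},$$

the rotation is a product of two reflections. Explicitly, $\rho_{P_j} = \sigma_{v_j'} \circ \sigma_{u_j}$ (first $\sigma_{u_j}$,
then $\sigma_{v_j'}$), where $v_j' = \cos\frac{\theta_j}{2} \, u_j + \sin\frac{\theta_j}{2} \, v_j$. □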
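
The factorisation of a plane rotation into two hyperplane reflections can be tested numerically. The sketch below takes the following reading as its assumption: with $v' = \cos\frac{\theta}{2} u + \sin\frac{\theta}{2} v$, applying $\sigma_u$ first and then $\sigma_{v'}$ gives the rotation through $\theta$ in the plane spanned by the orthonormal pair $u, v$. The function names are ours.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def reflect(x, y):
    """Hyperplane reflection sigma_x(y) = y - 2<y, x> x, for a unit vector x."""
    c = 2 * dot(y, x)
    return [yi - c * xi for yi, xi in zip(y, x)]

def rotate_by_reflections(theta, u, v, y):
    """Apply sigma_u first, then sigma_{v'}, v' = cos(theta/2) u + sin(theta/2) v."""
    v_prime = [math.cos(theta / 2) * ui + math.sin(theta / 2) * vi
               for ui, vi in zip(u, v)]
    return reflect(v_prime, reflect(u, y))

theta = 0.7
u, v, e3 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]

image_u = rotate_by_reflections(theta, u, v, u)    # cos(theta) u + sin(theta) v
image_v = rotate_by_reflections(theta, u, v, v)    # -sin(theta) u + cos(theta) v
image_e3 = rotate_by_reflections(theta, u, v, e3)  # vectors orthogonal to the plane are fixed
```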




Corollary 5.5.36 The matrix representation of the orthogonal transformation $T$
on $\mathbb{R}^n$ described in Theorem 5.5.33 with respect to a suitable orthonormal basis is
$A = [a_{ij}]$, where
(i) $a_{ii} = 1$ for all $i \leq r = \dim V$ ($r$ may be 0 also),
(ii) $a_{ii} = -1$ for all $i = r + j \leq r + m$, where $m = \dim W$ ($m$ may also be 0),
(iii) $a_{ii} = \cos\theta_j$ for $i = r + m + 2j - 1$ and $i = r + m + 2j$, $j \leq l$,
(iv) $a_{i\,i+1} = \sin\theta_j$ and $a_{i+1\,i} = -\sin\theta_j$ for $i = r + m + 2j - 1$, $j \leq l$,
(v) the rest of the entries are 0.


Proof Taking orthonormal bases of $V$, $W$, and of the two-dimensional subspaces
$P_1, P_2, \ldots, P_l$ together, we get an orthonormal basis of $\mathbb{R}^n$ with respect to which the
matrix representation is the required one. □


Corollary 5.5.37 Every orthogonal matrix $A$ is orthogonally similar to a matrix
of the form described in the above corollary. More explicitly, there is an orthogonal
matrix $O$ such that $OAO^t$ is a matrix of the form described in the above
corollary. □


Corollary 5.5.38 An orthogonal transformation with determinant 1 is a composition
of an even number of hyperplane reflections, and one with determinant $-1$ is a
composition of an odd number of hyperplane reflections. □


Corollary 5.5.39 Two orthogonal matrices $A$ and $B$ are similar if and only if they
have the same set of eigenvalues counted with their multiplicities. □


Exercises


5.5.1 Find invariant subspaces of the differential operator $D$ on the space $P_n$ of
polynomials of degree at most $n$ over reals. Find its characteristic polynomial, and
also the eigenvalues.


5.5.2 Consider the matrix

$$A = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}.$$

Show that the cube roots of 1 are precisely the eigenvalues of $A$. Show that the matrix
is diagonalisable over the field $\mathbb{C}$ of complex numbers. Find a complex matrix $P$ such
that $P^{-1}AP$ is a diagonal matrix. Is it diagonalisable over the field $\mathbb{R}$ of reals?


5.5.3 Show that the matrix

$$A = \begin{bmatrix} 1 & 0 & 2 \\ 0 & 4 & 7 \\ 0 & 0 & 2 \end{bmatrix}$$

is diagonalisable over the field $\mathbb{R}$ of real numbers, and find $P$ such that $P^{-1}AP$ is a
diagonal matrix.


5.5.4 Show that the matrix

$$A = \begin{bmatrix} 2 & 3 & 4 \\ 0 & 2 & 5 \\ 0 & 0 & 2 \end{bmatrix}$$

is not diagonalisable even over the field $\mathbb{C}$ of complex numbers.


5.5.5 Show that the matrix

$$A = \begin{bmatrix} 1 - \cos a \sin a & \cos^2 a \\ -\sin^2 a & 1 + \sin a \cos a \end{bmatrix}$$

is similar to an upper triangular matrix over $\mathbb{R}$. Find $P$ such that $PAP^{-1}$ is upper
triangular. What is $PAP^{-1}$? Show that it is not similar to a diagonal matrix even over
the field $\mathbb{C}$ of complex numbers.


5.5.6 Show that a $2 \times 2$ matrix over reals all of whose off-diagonal entries are
positive has all its eigenvalues real. Determine a necessary and sufficient condition
on the entries of a $2 \times 2$ matrix for its diagonalisability.


5.5.7 Suppose that $T^2 - 5T + 6I = 0$. Determine all possible eigenvalues of $T$.


5.5.8 Suppose that $A$ is nonsingular. Show that $AB$ and $BA$ have the same eigenvalues.


5.5.9 Let $A$ be an $n \times n$ matrix. Suppose that $A^m = 0$ for some $m \geq 1$. Show that
$A^n = 0$.


5.5.10 Let $A$ be a nilpotent $n \times n$ matrix. Show that $I_n + A$ is invertible, and
$\det(I_n + A) = 1$.


5.5.11 Let $A$ and $B$ be complex $n \times n$ matrices such that $AB - BA$ commutes with
$A$. Show that $(AB - BA)^n = 0$.


5.5.12 Let $A$ be an $n \times n$ matrix. Define a map $M_A$ from $M_n(F)$ to itself by $M_A(B) =
A \cdot B$. Show that $M_A$ is a linear transformation. Relate the trace of $A$ and that of $M_A$.


5.5.13 Suppose that A’ = A”. What are possible eigenvalues of A? 
5.5.14 Show that the determinant of a Hermitian matrix is always real. 


5.5.15 Show that the determinant of a skew-symmetric real matrix of odd order is 
0. 


5.5.16 Let $A$ be an $n \times n$ skew-Hermitian matrix. Show that the determinant of $A$ is
either 0, or it is purely imaginary if $n$ is odd. Show that it is purely real if $n$ is even.


5.5.17 Show that if $A$ is $n \times n$ Hermitian, then $iI_n + A$ is invertible.


5.5.18 Show that if $A$ is a skew real symmetric (skew-Hermitian) $n \times n$ matrix, then
$I_n + A$ is nonsingular.




5.5.19 Show that if X*AX is real for all complex vector X, then A is Hermitian. 


5.5.20 Find an orthogonal matrix $O$ such that $O^tAO$ is diagonal, where

$$A = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 3 \end{bmatrix}.$$

Find a real symmetric matrix $B$, if possible, such that $B^2 = A$.


5.5.21 Show that all eigenvalues of A*A are real, and A*A is unitarily similar to a 
diagonal matrix. 


5.5.22 Let $A$ be a real matrix. Show that $A^tA$ is similar to a diagonal matrix.


5.5.23 Show that every real symmetric matrix $A$ can be expressed as $A = PDP^t$,
where $D$ is a diagonal matrix.


5.5.24 Show that every nonsingular real symmetric matrix $A$ can be expressed as
$A = LDL^t$, where $L$ is a lower triangular matrix, and $D$ a diagonal matrix.


5.5.25 Let $A$ be an $n \times n$ matrix with entries in $\mathbb{R}$. Show that the map $\langle \, , \rangle$ from
$\mathbb{R}^n \times \mathbb{R}^n$ to $\mathbb{R}$ defined by $\langle x, y \rangle = xAy^t$ is an inner product if and only if $A$ is
symmetric, and all the eigenvalues of $A$ are positive.

5.5.26 Which of the following matrices are positive or positive definite?
(i)

$$A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 1 & 1 \\ 1 & 1 & 3 \end{bmatrix}$$


(ii) 


(iii) 


(iv) 


(v)

$$A = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix}$$

For each of the above matrices, find orthogonal matrices $O$ such that $O^tAO$ is a
diagonal matrix. Express the positive definite matrices as $BB^t$.


5.5.27 Find the cube roots of the matrices in Exercise 5.5.26.


5.5.28 Show that $I + iA^*A$ is nonsingular for all complex matrices $A$.


5.5.29 Show that the matrix

$$\begin{bmatrix} \cos a & \sin a \\ -\sin a & \cos a \end{bmatrix}$$

is diagonalisable over $\mathbb{R}$ if and only if $a = n\pi$ for some integer $n$. Diagonalize it over
the field $\mathbb{C}$ of complex numbers.


5.5.30 Show that every orthogonal 2 x 2 matrix with determinant 1 is a matrix of 
the form given in the above exercise. 


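5.5.31 Show that the group $O(2)$ is isomorphic to a semidirect product $SO(2) \rtimes \mathbb{Z}_2$
(note that $O(2)$ is non-abelian, so the product is not direct). Show that the group
$SO(2)$ is isomorphic to the circle group $S^1$.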


5.5.32 Show that $O(3)$ is isomorphic to $SO(3) \times \mathbb{Z}_2$.


5.5.33 Let $A \in SO(3)$. Show that $A - I_3$ is singular. Deduce that 1 is always an
eigenvalue of $A$. What are the other possible eigenvalues of $A$?
Hint. $Det(A - I_3) = Det(A^t(A - I_3)) = Det(I_3 - A^t)$.


5.5.34 Use the above exercise to show that every matrix $A$ in $SO(3)$ is similar to a
matrix of the form

$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos a & \sin a \\ 0 & -\sin a & \cos a \end{bmatrix}.$$

Deduce that every matrix $A$ in $SO(3)$ represents a rotation in $\mathbb{R}^3$ about an axis through
an angle $a$, where the trace of $A$ is $2\cos a + 1$. In particular, deduce that $-1 \leq Tr(A) \leq 3$.
This justifies the name rotation group for $SO(3)$.


5.5.35 Show that two orthogonal $3 \times 3$ matrices are similar if and only if they have
the same trace, and also the same determinant.


5.5.36 Show that SO(3) is generated by reflections. 


5.5.37 Describe the closed subgroups of SO(3). 


5.5.38 Show that the group $O(n)$ acts transitively on the set $V(n, r)$ of $r$-dimensional
subspaces of $\mathbb{R}^n$. Describe the isotropy subgroup of the subspace $W \in V(n, r)$
consisting of vectors with last $n - r$ coordinates 0.


5.5.39 Show that the group $O(n)$ acts transitively on the $(n - 1)$-sphere $S^{n-1} = \{x \in
\mathbb{R}^n \mid \|x\| = 1\}$ in $\mathbb{R}^n$. Describe the isotropy group $O(n)_{e_1}$.


5.5.40 Consider the sphere $S^2 = \{x \in \mathbb{R}^3 \mid \|x\| = 1\}$ in $\mathbb{R}^3$. Define a map $d_{S^2}$
from $S^2 \times S^2$ to $\mathbb{R}^+ \cup \{0\}$ by $\cos d_{S^2}(x, y) = \langle x, y \rangle$. Show that $d_{S^2}$ is a metric,
called the spherical metric. Use Exercise 5.5.38 to show that the map $d_{S^{n-1}}$ from
$S^{n-1} \times S^{n-1}$ to $\mathbb{R}^+ \cup \{0\}$ defined by $\cos d_{S^{n-1}}(x, y) = \langle x, y \rangle$ is a metric. The metric
space $(S^n, d_{S^n})$ is called the spherical $n$-space. Describe the geodesics (paths of
shortest distance) in $S^n$.


5.5.41 Consider the upper part of the hyperboloid $H^2 = \{x = (x_1, x_2, x_3) \mid x_1^2 -
x_2^2 - x_3^2 = 1, \; x_1 > 0\}$. Show that $x, y \in H^2$ implies that $x_1y_1 - x_2y_2 - x_3y_3 \geq 1$.
Show that the map $d_{H^2}$ from $H^2 \times H^2$ to $\mathbb{R}^+ \cup \{0\}$ defined by $\cosh(d_{H^2}(x, y)) =
x_1y_1 - x_2y_2 - x_3y_3$ is a metric (called the hyperbolic metric). (How to generalize it
for arbitrary $n$?)


5.5.42 Show that every matrix in SO(3) is similar to a diagonal matrix over the field 
C of complex numbers. 


5.5.43 Show that a square matrix $A$ with entries in the field $\mathbb{C}$ of complex numbers
is unitarily similar to a diagonal matrix if and only if it is a normal matrix in the sense
that $AA^* = A^*A$.


5.5.44 Find the polar decomposition of the following complex matrix. 


[ii 


5.5.45 Find the polar decomposition of the following real matrix. 


bi) 


5.5.46 Find a singular value decomposition of the following matrices: 


5.5.47 The map $\times$ from $\mathbb{R}^3 \times \mathbb{R}^3$ to $\mathbb{R}^3$ defined by $a \times b = (a_2b_3 - a_3b_2, \; a_3b_1 -
a_1b_3, \; a_1b_2 - a_2b_1)$ is called the vector product on $\mathbb{R}^3$. Show that the vector product
is uniquely characterized by the requirement that it is bilinear, and it satisfies

(i) $a \times a = 0$,

(ii) $\langle a \times b, a \rangle = 0 = \langle a \times b, b \rangle$ for all $a, b \in \mathbb{R}^3$,

(iii) $\langle e_1 \times e_2, e_3 \rangle = 1$.

Further, show that a nonzero alternating map from $\mathbb{R}^n \times \mathbb{R}^n$ to $\mathbb{R}^n$ exists if and only
if $n = 3$. In particular, the concept of vector product exists only on $\mathbb{R}^3$.


5.6 Bilinear and Quadratic Forms 


In this section, we discuss bilinear and quadratic forms, and their canonical reduction. 


Definition 5.6.1 Let $V$ be a finite-dimensional vector space over a field $F$. A map
$f$ from $V \times V$ to $F$ is called a bilinear form if $f$ is linear in each coordinate in the
sense that

(i) $f(ax + by, z) = af(x, z) + bf(y, z)$, and

(ii) $f(x, ay + bz) = af(x, y) + bf(x, z)$

for all $x, y, z \in V$, and $a, b \in F$.


Thus, a map $f$ from $V \times V$ to $F$ is a bilinear form if and only if the maps $f_x$ and $f^y$
from $V$ to $F$ defined by $f_x(y) = f(x, y)$ and $f^y(x) = f(x, y)$ are linear functionals.
Further, then the maps $x \mapsto f_x$ and $y \mapsto f^y$, denoted by $L_f$ and $R_f$, respectively, are
linear transformations from $V$ to $V^*$ (verify).

The zero map from $V \times V$ to $F$ is a bilinear form on $V$. Any inner product on a
real vector space is a bilinear form. The determinant map from $F^2 \times F^2$ to $F$ is
another bilinear form on $F^2$, where $F$ is a field.


Example 5.6.2 Consider the vector space $F^n$ of column vectors over $F$. Let $A$ be an
$n \times n$ matrix over $F$. The map $f$ from $F^n \times F^n$ to $F$ defined by

$$f(X, Y) = X^tAY$$

is a bilinear form on $F^n$ (verify). We shall see below that these are all the bilinear forms
on $F^n$. In fact, given any bilinear form $f$ on a vector space $V$ of dimension $n$ over
$F$, there exists an isomorphism $T$ from $V$ to $F^n$ (corresponding to each choice of
basis), and a matrix $A$ such that $f(x, y) = T(x)^tAT(y)$. Thus, essentially these are
all the bilinear forms on a vector space of dimension $n$.


Example 5.6.3 Let $\phi$ and $\psi$ be linear functionals on a vector space $V$ of dimension
$n$. Then the map $f$ from $V \times V$ to $F$ defined by $f(x, y) = \phi(x) \cdot \psi(y)$ is a bilinear
form on $V$. Is it true that every bilinear form on $V$ is of this form? Support.


Let $f$ and $g$ be bilinear forms on $V$ and $a, b \in F$. Then it is easily seen that
$af + bg$ defined by $(af + bg)(x, y) = af(x, y) + bg(x, y)$ is a bilinear form on $V$.
Further, the zero map which takes everything to 0 is already a bilinear form. Thus,
the set $BL(V, F)$ of bilinear forms on $V$ is a vector space over $F$ with respect to the
operations defined above. Let us fix an ordered basis $\{u_1, u_2, \ldots, u_n\}$ of $V$. Define a
map $M_{u_1, u_2, \ldots, u_n}$ from $BL(V, F)$ to $M_n(F)$ by

$$M_{u_1, u_2, \ldots, u_n}(f) = [a_{ij}],$$

where $a_{ij} = f(u_i, u_j)$. This map is a linear transformation (verify), and it is called
the matrix representation map relative to the ordered basis $\{u_1, u_2, \ldots, u_n\}$. The
matrix $M_{u_1, u_2, \ldots, u_n}(f)$ is called the matrix representation of the bilinear form $f$.


Theorem 5.6.4 The matrix representation map $M_{u_1, u_2, \ldots, u_n}$ is a vector space
isomorphism from $BL(V, F)$ to $M_n(F)$.


Proof It is already seen to be a linear transformation. Thus, it is sufficient to show
that $M_{u_1, u_2, \ldots, u_n}$ is bijective. Suppose that $M_{u_1, u_2, \ldots, u_n}(f) = M_{u_1, u_2, \ldots, u_n}(g)$. Then
$f(u_i, u_j) = g(u_i, u_j)$ for all $i, j$. Let $x = \sum_{i=1}^n x_iu_i$ and $y = \sum_{j=1}^n y_ju_j$ be any two
members of $V$. Then using the bilinearity of $f$ and $g$, we get

$$f(x, y) = f\left(\sum_{i=1}^n x_iu_i, \sum_{j=1}^n y_ju_j\right) = \sum_{i,j} x_i f(u_i, u_j) y_j
= \sum_{i,j} x_i g(u_i, u_j) y_j = g(x, y).$$

This shows that $f = g$. Observe that $f(x, y) = X^tAY$, where $A = [a_{ij}]$ is the
matrix representation of $f$ with respect to the ordered basis $\{u_1, u_2, \ldots, u_n\}$, and $X$ and
$Y$ are the column vectors whose $i$th row entries are $x_i$ and $y_i$, respectively. Conversely,
let $A = [a_{ij}]$ be a matrix in $M_n(F)$. Define a map $f$ from $V \times V$ to $F$ by

$$f(x, y) = X^tAY,$$

where $x = \sum_{i=1}^n x_iu_i$, $y = \sum_{j=1}^n y_ju_j$, $X$ is the column vector with $i$th row entry $x_i$, and
$Y$ is the column vector with $i$th row entry $y_i$. It is easy to observe that $f$ defined above
is a bilinear form. Since $f(u_i, u_j) = a_{ij}$, the matrix of $f$ is $A$. This shows that the
matrix representation map is an isomorphism. □
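
The correspondence $f \mapsto [f(u_i, u_j)]$ and the identity $f(x, y) = X^tAY$ can be illustrated as follows. This is a small sketch with names of our own choosing; it uses the determinant form on $F^2$ and the standard basis, so that the coordinate column of a vector is the vector itself.

```python
def bilinear_matrix(f, basis):
    """Matrix [f(u_i, u_j)] of the bilinear form f relative to an ordered basis."""
    return [[f(u, v) for v in basis] for u in basis]

def evaluate(A, X, Y):
    """Compute X^t A Y for coordinate columns X and Y given as lists."""
    n = len(A)
    return sum(X[i] * A[i][j] * Y[j] for i in range(n) for j in range(n))

def det_form(x, y):
    """The determinant map on F^2 x F^2, a bilinear form on F^2."""
    return x[0] * y[1] - x[1] * y[0]

standard_basis = [[1, 0], [0, 1]]
A = bilinear_matrix(det_form, standard_basis)   # rows (0, 1) and (-1, 0)

x, y = [2, 3], [5, 7]
value_from_matrix = evaluate(A, x, y)
value_direct = det_form(x, y)
```

Both values equal $2 \cdot 7 - 3 \cdot 5 = -1$.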


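Corollary 5.6.5 Let $V$ be a vector space and $\{u_1, u_2, \ldots, u_n\}$ an ordered basis of
$V$. Let $\{u_1^*, u_2^*, \ldots, u_n^*\}$ be the dual basis. Then $\{f_{ij} = u_i^*u_j^* \mid 1 \leq i \leq n, \; 1 \leq j \leq n\}$,
where $f_{ij}(x, y) = u_i^*(x) \cdot u_j^*(y)$ as in Example 5.6.3, forms a basis of $BL(V, F)$.


Proof The matrix representation map $M_{u_1, u_2, \ldots, u_n}$ takes $f_{ij}$ to $e_{ij}$, and since $\{e_{ij} \mid 1 \leq
i, j \leq n\}$ forms a basis of $M_n(F)$, the result follows from the above theorem. □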


Effect of Change of Basis on Matrix Representation


Theorem 5.6.6 The matrix representations of a bilinear form on a vector space $V$
with respect to different choices of bases are all congruent.


Proof Let $f$ be a bilinear form on $V$. Let $\{u_1, u_2, \ldots, u_n\}$ and $\{v_1, v_2, \ldots, v_n\}$ be
ordered bases of $V$. Let $P = [p_{ij}]$ be the matrix of transformation from the ordered
basis $\{u_1, u_2, \ldots, u_n\}$ to the basis $\{v_1, v_2, \ldots, v_n\}$. This means that $v_j = \sum_{i=1}^n p_{ij}u_i$.
Clearly, $P$ is nonsingular. Further, suppose that

$$M_{u_1, u_2, \ldots, u_n}(f) = [a_{ij}]$$

and

$$M_{v_1, v_2, \ldots, v_n}(f) = [b_{ij}].$$

Then $a_{ij} = f(u_i, u_j)$ and $b_{ij} = f(v_i, v_j)$. Thus, using the bilinearity of $f$ we get

$$b_{ij} = f(v_i, v_j) = f\left(\sum_{k=1}^n p_{ki}u_k, \sum_{l=1}^n p_{lj}u_l\right) = \sum_{k=1}^n p_{ki}\left(\sum_{l=1}^n f(u_k, u_l)p_{lj}\right).$$

This shows that $b_{ij}$ is the $i$th row $j$th column entry of $P^tM_{u_1, u_2, \ldots, u_n}(f)P$. Hence

$$M_{v_1, v_2, \ldots, v_n}(f) = P^tM_{u_1, u_2, \ldots, u_n}(f)P,$$

where $P$ is the matrix of transformation from the ordered basis $\{u_1, u_2, \ldots, u_n\}$ to
$\{v_1, v_2, \ldots, v_n\}$. □


Definition 5.6.7 Let $f$ and $g$ be bilinear forms on $V$. We say that $f$ is congruent to
$g$ if there is an isomorphism $T$ from $V$ to $V$ such that $g(x, y) = f(T(x), T(y))$ for
all $x, y \in V$.


The main problem in the theory of bilinear forms is the classification of bilinear
forms up to congruence over different fields. We shall give a solution to this problem
for symmetric ($f(x, y) = f(y, x)$) bilinear forms. The following theorem reduces this
problem to the problem of classifying matrices up to congruence. More precisely,
we need to determine a unique member from each congruence class of matrices
and determine an algorithm to reduce a matrix to the unique representative of the
congruence class determined by that matrix.


Theorem 5.6.8 A bilinear form $f$ is congruent to a bilinear form $g$ on $V$ if and
only if their matrix representations corresponding to any choice of ordered bases are
congruent.


Proof Let $f$ and $g$ be congruent bilinear forms on $V$. Then there is an isomorphism $T$
from $V$ to $V$ such that $g(x, y) = f(T(x), T(y))$ for all $x, y \in V$. Let $\{u_1, u_2, \ldots, u_n\}$
be an ordered basis of $V$. Then $\{T(u_1), T(u_2), \ldots, T(u_n)\}$ is also an ordered basis of
$V$. Also $g(u_i, u_j) = f(T(u_i), T(u_j))$ for all $i, j$. This shows that

$$M_{u_1, u_2, \ldots, u_n}(g) = M_{T(u_1), T(u_2), \ldots, T(u_n)}(f).$$

Since the matrices of a bilinear form associated to any two ordered bases of $V$ are congruent,
it follows that the matrices of $f$ and $g$ corresponding to any choice of ordered bases
are congruent. Conversely, suppose that $M_{u_1, u_2, \ldots, u_n}(g) = P^tM_{v_1, v_2, \ldots, v_n}(f)P$. Then

$$g(u_i, u_j) = \sum_{k=1}^n p_{ki}\left(\sum_{l=1}^n f(v_k, v_l)p_{lj}\right) = f\left(\sum_{k=1}^n p_{ki}v_k, \sum_{l=1}^n p_{lj}v_l\right) = f(w_i, w_j),$$

where $w_i = \sum_{k=1}^n p_{ki}v_k$, and $w_j = \sum_{l=1}^n p_{lj}v_l$. Since $P$ is nonsingular, $\{w_1, w_2, \ldots,
w_n\}$ forms an ordered basis of $V$. Thus, if $T$ denotes the isomorphism from $V$
to $V$ which takes $u_i$ to $w_i$, then $g(u_i, u_j) = f(T(u_i), T(u_j))$ for all $i, j$. Since
$\{u_1, u_2, \ldots, u_n\}$ is an ordered basis of $V$, and $f$ and $g$ are bilinear forms, it follows
that $g(x, y) = f(T(x), T(y))$ for all $x, y \in V$. □


Example 5.6.9 We have seen above that any bilinear form on $F^n$ is given by
$f(X, Y) = X^tAY = \sum_{i,j} x_i a_{ij} y_j$, where $A = [a_{ij}]$, X is the column vector whose ith
row is $x_i$, and Y is the column vector whose ith row is $y_i$. Note that the matrix of this
bilinear form with respect to the standard ordered basis is A. Consider the bilinear
form on $\mathbb{R}^3$ given by

\[ f(X, Y) = x_1y_2 + 2x_1y_3 + x_2y_1 + x_2y_3 + 2x_3y_1 + x_3y_2. \]

The matrix A of this bilinear form with respect to the standard ordered basis is

\[ A = \begin{pmatrix} 0 & 1 & 2 \\ 1 & 0 & 1 \\ 2 & 1 & 0 \end{pmatrix}. \]

From Example 2.7.5, it follows that this matrix is congruent to the diagonal matrix
$\mathrm{Diag}(2, -\frac{1}{2}, -4)$, and the matrix P of transformation is

\[ P = \begin{pmatrix} 1 & -\frac{1}{2} & -1 \\ 1 & \frac{1}{2} & -2 \\ 0 & 0 & 1 \end{pmatrix}. \]

Thus, the bilinear form g on $\mathbb{R}^3$ given by

\[ g(X, Y) = 2x_1y_1 - \tfrac{1}{2}x_2y_2 - 4x_3y_3 \]

is congruent to f. The isomorphism T from $\mathbb{R}^3$ to itself which takes $e_i$ to the ith
column of P is an isomorphism such that $g(X, Y) = f(T(X), T(Y))$ for all $X, Y \in \mathbb{R}^3$.
In fact, the substitution $x_1 \mapsto x_1 - \frac{1}{2}x_2 - x_3$, $x_2 \mapsto x_1 + \frac{1}{2}x_2 - 2x_3$, $x_3 \mapsto x_3$, and
$y_1 \mapsto y_1 - \frac{1}{2}y_2 - y_3$, $y_2 \mapsto y_1 + \frac{1}{2}y_2 - 2y_3$, $y_3 \mapsto y_3$ transforms f to g.
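The congruence claimed in this example can be checked mechanically. The following is a minimal sketch in exact rational arithmetic; the helper names `transpose`, `matmul`, `f`, `g` and `T` are illustrative and not from the text, and the entries of P are the ones consistent with the substitution stated in the example.

```python
from fractions import Fraction as Fr

def transpose(M):
    """Transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*M)]

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

# Matrix of f with respect to the standard ordered basis.
A = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]

# Matrix of transformation; its columns are the images T(e_1), T(e_2), T(e_3).
P = [[Fr(1), Fr(-1, 2), Fr(-1)],
     [Fr(1), Fr(1, 2), Fr(-2)],
     [Fr(0), Fr(0), Fr(1)]]

# Congruent matrix P^t A P; it should be diagonal.
D = matmul(matmul(transpose(P), A), P)

def f(X, Y):
    """The bilinear form f(X, Y) = X^t A Y."""
    return sum(X[i] * A[i][j] * Y[j] for i in range(3) for j in range(3))

def g(X, Y):
    """The diagonal form 2 x1 y1 - (1/2) x2 y2 - 4 x3 y3."""
    return 2 * X[0] * Y[0] - Fr(1, 2) * X[1] * Y[1] - 4 * X[2] * Y[2]

def T(X):
    """The isomorphism X -> P X."""
    return [sum(P[i][j] * X[j] for j in range(3)) for i in range(3)]

for row in D:
    print([str(entry) for entry in row])
```

Evaluating `g(X, Y)` and `f(T(X), T(Y))` on a few sample vectors is a quick way to see the relation between the two forms in action.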


Theorem 5.6.10 Let f be a bilinear form on a vector space V of finite dimension n
over a field F. Then

\[ \rho(L_f) = \rho(A) = \rho(R_f), \]

where A is a matrix of f corresponding to any choice of basis, and $\rho$ denotes the
rank.




Proof Let us first observe that matrices of f corresponding to different ordered bases
are congruent, and so they all have the same rank. Because of the rank-nullity theorem,
it is sufficient to show that $\nu(L_f) = \nu(A) = \nu(R_f)$, where $\nu$ denotes the nullity.
Now,

\[ \nu(L_f) = \dim(\{x \in V \mid L_f(x) = f_x = 0\}) = \dim(\{x \in V \mid f(x, y) = 0 \text{ for all } y \in V\}). \]

If we fix an ordered basis $\{u_1, u_2, \ldots, u_n\}$ of V, then the map T from V to the
vector space $F^n$, which associates to $x = \sum_{i=1}^{n} x_i u_i$ the column vector X whose
ith row is $x_i$, is an isomorphism, and then $f(x, y) = T(x)^t A T(y) = X^tAY$. This
isomorphism takes the subspace $\{x \mid f(x, y) = 0 \text{ for all } y \in V\}$ of V isomorphically
to the subspace $\{X \in F^n \mid X^tAY = 0 \text{ for all } Y \in F^n\}$. Thus
$\nu(L_f) = \dim(\{X \in F^n \mid X^tAY = 0 \text{ for all } Y\})$. Next, over any field, if C is a column
vector such that $C^tY = 0$ for all $Y \in F^n$, then $C = 0$ (verify). Hence
$\{X \in F^n \mid X^tAY = 0 \text{ for all } Y \in F^n\} = \{X \in F^n \mid X^tA = 0\} = \{X \in F^n \mid A^tX = 0\}$.
This shows that $\nu(L_f) = \nu(A^t)$. Since A is a square matrix, $\nu(A^t) = \nu(A)$. This shows
that $\rho(L_f) = \rho(A)$. Similarly, we can show that $\rho(R_f) = \rho(A)$. ♯


Definition 5.6.11 Let f be a bilinear form on V. Then the common number
$\rho(L_f) = \rho(A) = \rho(R_f)$ is called the rank of f.


Corollary 5.6.12 Let f be a bilinear form on a vector space V of dimension n. Then
the following conditions are equivalent.
1. Rank of f is n.
2. $f_x(y) = 0$ for all y implies that $x = 0$.
3. $f^y(x) = 0$ for all x implies that $y = 0$.


Definition 5.6.13 A bilinear form on V is called non-degenerate, or nonsingular 
bilinear form if it satisfies any one (and hence all) of the above three conditions in 
the corollary. 


Symmetric Bilinear Forms 


Now, we try to describe bilinear forms which have a nice diagonal representation

\[ f(x, y) = \sum_{i=1}^{n} a_i x_i y_i \]

with respect to an ordered basis $\{u_1, u_2, \ldots, u_n\}$, where $x = \sum_{i=1}^{n} x_i u_i$ and $y = \sum_{i=1}^{n} y_i u_i$.
This is equivalent to saying that the matrix representation of f with respect to some
ordered basis is diagonal. Since the matrix representations of a bilinear form with respect
to any two bases are congruent, we need to characterize those bilinear forms whose
matrix representations are congruent to diagonal matrices. If A is a matrix which
is congruent to a diagonal matrix D, then there is a nonsingular matrix P such
that $P^tAP = D$, or equivalently, $A = QDQ^t$, where $Q = (P^{-1})^t$. Thus,
$A^t = QD^tQ^t = QDQ^t = A$. This shows that A is a symmetric matrix. In other words, all
matrices associated to the bilinear form f are symmetric. This is so if and only if
$f(x, y) = f(y, x)$ for all $x, y \in V$ (verify). Such a bilinear form is called a symmetric
bilinear form.


Corollary 5.6.14 A necessary condition for a bilinear form f to have a diagonal
representation is that it is a symmetric bilinear form. ♯


Let f be a symmetric bilinear form on a finite-dimensional vector space V. A pair
of vectors $x, y \in V$ is said to be orthogonal to each other if $f(x, y) = 0$. Let W be
a subspace of V. Then $W^\perp = \{x \in V \mid f(x, y) = 0 \text{ for all } y \in W\}$ is a subspace
of V (verify), and it is called the orthogonal complement of W with respect to the
bilinear form f. Observe that unlike the case of an inner product space, $W \cap W^\perp$ may
be different from $\{0\}$. Clearly, $\ker L_f = \ker R_f = \{x \in V \mid f_x = 0\} = V^\perp$. If V is
a vector space over the field $\mathbb{R}$ of real numbers, then f is called positive (negative)
if $f(x, x) \geq 0$ ($\leq 0$) for all $x \in V$. It is said to be positive (negative) definite if it
is positive (negative), and $f(x, x) = 0$ if and only if $x = 0$. To say that f has a
diagonal representation with respect to the basis $S = \{x_1, x_2, \ldots, x_n\}$ is to say that
S is an orthogonal basis in the sense that the members of S are pairwise orthogonal.

The following result is the converse of the above corollary in case the field is of 
characteristic different from 2. 


Theorem 5.6.15 Let f be a symmetric bilinear form on a finite-dimensional vector 
space V over a field F of characteristic different from 2. Then there is an orthogonal 
basis of V, or equivalently, there is an ordered basis with respect to which f has 
diagonal representation. 


Proof The proof is by induction on the dimension of V. If the dimension of V is 0
or 1, then there is nothing to do. Assume that the result is true for symmetric bilinear
forms on vector spaces of dimension n. Let f be a symmetric bilinear form on a
vector space V of dimension n + 1. If f is the zero bilinear form, then there is nothing to
do. Suppose that $f \neq 0$. We claim that there is an $x \in V - \{0\}$ such that $f(x, x) \neq 0$.
Suppose not. Then $f(x, x) = 0$ for all $x \in V$. Now, since f is a symmetric bilinear
form,

\[ f(x + y, x + y) - f(x - y, x - y) = 2f(x, y) + 2f(y, x) = 4f(x, y) \]

for all $x, y \in V$. Hence $4f(x, y) = 0$ for all $x, y \in V$. Since the field is of
characteristic different from 2, $f(x, y) = 0$ for all $x, y \in V$. This is a contradiction
to the supposition that $f \neq 0$. Let $u_1 \in V - \{0\}$ be such that $f(u_1, u_1) \neq 0$. Let
$W = \{au_1 \mid a \in F\}$ be the subspace generated by $u_1$. Then the dimension of W is 1.
Consider $W^\perp = \{v \in V \mid f(u_1, v) = 0\}$. Then $W^\perp$ is a subspace of V (verify).
Suppose that $au_1 \in W^\perp$. Then $0 = f(u_1, au_1) = af(u_1, u_1)$. Since $f(u_1, u_1) \neq 0$,
it follows that $a = 0$. Thus, $W \cap W^\perp = \{0\}$. Further, let $v \in V$. Then

\[ f\Big(u_1, v - \frac{f(u_1, v)}{f(u_1, u_1)} u_1\Big) = f(u_1, v) - f(u_1, v) = 0. \]

Put $w = v - \frac{f(u_1, v)}{f(u_1, u_1)} u_1$. Then $w \in W^\perp$, and

\[ v = \frac{f(u_1, v)}{f(u_1, u_1)} u_1 + w \]

belongs to $W + W^\perp$. This shows that $V = W \oplus W^\perp$. Since the restriction of f to $W^\perp$
is also a symmetric bilinear form on $W^\perp$ and the dimension of $W^\perp$ is n, it follows that
there is an ordered basis $\{u_2, u_3, \ldots, u_{n+1}\}$ of $W^\perp$ with respect to which f restricted to $W^\perp$
has diagonal representation. In other words, $f(u_i, u_j) = 0$ for all $i \neq j$, $i \geq 2$ and
$j \geq 2$. Already $f(u_1, u_j) = 0$ for all $j \geq 2$. Thus, f has the diagonal representation
with respect to the ordered basis $\{u_1, u_2, \ldots, u_{n+1}\}$. ♯


Corollary 5.6.16 Let A be a symmetric matrix with entries in a field F of
characteristic different from 2. Then there is a nonsingular matrix P such that $P^tAP$ is a
diagonal matrix. ♯


Remark 5.6.17 The proof of the above theorem and the corollary is algorithmic. 
It gives an algorithm to reduce a symmetric bilinear form (symmetric matrix) over 
a field of characteristic different from 2 to a diagonal bilinear form (matrix). An 
algorithm to reduce a symmetric matrix over a field of characteristic different from 
2 congruently to a diagonal matrix is given in the proof of Theorem 2.7.2 which is 
further illustrated in Example 2.7.5. This also gives an algorithm (see Example 5.6.9) 
to reduce a symmetric bilinear form to diagonal form. 
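The inductive proof of Theorem 5.6.15 can be turned into a short program. What follows is a minimal sketch over the rational numbers, not the procedure of Theorem 2.7.2; the function names `form` and `orthogonal_basis` are illustrative. At each stage it looks for a vector u with f(u, u) ≠ 0 (first among the current vectors, then among sums of two of them, as in the proof), and replaces every other vector v by v − (f(u, v)/f(u, u))u, which lies in the orthogonal complement of u.

```python
from fractions import Fraction

def form(A, x, y):
    """Value f(x, y) = x^t A y of the bilinear form with matrix A."""
    n = len(A)
    return sum(x[i] * A[i][j] * y[j] for i in range(n) for j in range(n))

def orthogonal_basis(A):
    """Orthogonal basis (as coordinate vectors) for the symmetric matrix A over Q."""
    n = len(A)
    rest = [[Fraction(int(i == j)) for i in range(n)] for j in range(n)]
    basis = []
    while rest:
        pivot = None
        # Look for u with f(u, u) != 0 among the remaining vectors ...
        for i, p in enumerate(rest):
            if form(A, p, p) != 0:
                pivot = (i, p)
                break
        # ... and otherwise among sums of two of them.
        if pivot is None:
            for i, p in enumerate(rest):
                for r in rest[i + 1:]:
                    s = [a + b for a, b in zip(p, r)]
                    if form(A, s, s) != 0:
                        pivot = (i, s)
                        break
                if pivot is not None:
                    break
        if pivot is None:
            # f vanishes on all remaining vectors and on their pairwise sums,
            # hence on their span (characteristic is not 2): nothing left to do.
            basis.extend(rest)
            break
        i, u = pivot
        fuu = form(A, u, u)
        others = rest[:i] + rest[i + 1:]
        # Project the other vectors into the orthogonal complement of u.
        rest = [[vk - form(A, u, v) / fuu * uk for vk, uk in zip(v, u)] for v in others]
        basis.append(u)
    return basis

A = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]          # the matrix of Example 5.6.9
basis = orthogonal_basis(A)
gram = [[form(A, x, y) for y in basis] for x in basis]
```

For the matrix of Example 5.6.9 the Gram matrix `gram` comes out diagonal, and the vectors in `basis`, written as columns, form a matrix of transformation.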


Corollary 5.6.18 Let f be a symmetric bilinear form on a finite-dimensional vector
space V over the field $\mathbb{C}$ of complex numbers (or over a field F of characteristic
different from 2 which contains a square root of each of its elements). Then there
is a basis $\{u_1, u_2, \ldots, u_n\}$ of V such that

(i) $f(u_i, u_j) = 0$ for $i \neq j$,

(ii) $f(u_i, u_i) = 1$ for $i \leq r$, where r is the rank of f, and

(iii) $f(u_i, u_i) = 0$ for $i \geq r + 1$.

More precisely, the matrix of f with respect to some ordered basis is in normal form.


Proof From Theorem 5.6.15, we can find an ordered basis $\{v_1, v_2, \ldots, v_n\}$ such that
the matrix of f with respect to this ordered basis is diagonal. Suppose that the rank of
f is r. We may assume that the first r diagonal entries are different from 0, and the rest
of the diagonal entries are 0. This means that $f(v_i, v_j) = 0$ for $i \neq j$, $f(v_i, v_i) \neq 0$
for $i \leq r$, and $f(v_i, v_i) = 0$ for $i \geq r + 1$. Since the field contains a square root of
each of its elements, we have $\sqrt{f(v_i, v_i)} \in \mathbb{C}$. Take $u_i = \frac{1}{\sqrt{f(v_i, v_i)}} v_i$ for $i \leq r$, and
$u_j = v_j$ for $j \geq r + 1$. Clearly, the ordered basis $\{u_1, u_2, \ldots, u_n\}$ has the required
properties. ♯


Corollary 5.6.19 Any symmetric matrix over $\mathbb{C}$ is congruent to a matrix in normal
form. ♯


Since any two congruent bilinear forms (matrices) have the same rank, we have the
following corollary.


Corollary 5.6.20 Any two symmetric bilinear forms (matrices) over the field $\mathbb{C}$ of
complex numbers are congruent if and only if they have the same rank. ♯




The above corollary is not true over the field $\mathbb{R}$ of real numbers. $I_n$ and $-I_n$ have
the same rank, whereas they are not congruent over the field $\mathbb{R}$ of real numbers (verify).
However, over the field $\mathbb{R}$ of real numbers we have the following results.


Proposition 5.6.21 Let f be a symmetric bilinear form of rank r on a finite-
dimensional real vector space V. Let $\{u_1, u_2, \ldots, u_n\}$ be an orthogonal basis of V
with $f(u_i, u_i) \neq 0$ for all $i \leq r$, and $f(u_i, u_i) = 0$ for all $i \geq r + 1$. Then f is positive
(negative) if and only if $f(u_i, u_i) \geq 0$ ($\leq 0$) for all i. It is positive (negative) definite if and
only if $f(u_i, u_i) > 0$ ($< 0$) for all i.


Proof If $x = \sum_{i=1}^{n} a_i u_i$, then $f(x, x) = \sum_{i=1}^{n} a_i^2 f(u_i, u_i)$. Since $a_i^2$ is always
non-negative, the result follows. ♯


Theorem 5.6.22 (Sylvester) Let f be a symmetric bilinear form on a finite-dimensional
real vector space V (more generally, over a subfield of the field of real numbers in which all
positive members have square roots). Then there is an ordered basis $\{u_1, u_2, \ldots, u_n\}$
of V, and non-negative integers r, p and q with $q = r - p$ such that

(i) $f(u_i, u_i) = 1$ for all $i \leq p$,

(ii) $f(u_i, u_i) = -1$ for all i, $p + 1 \leq i \leq r$,

(iii) $f(u_i, u_i) = 0$ for all $i \geq r + 1$, and

(iv) $f(u_i, u_j) = 0$ for all $i \neq j$.

Further, the integers r, p, and q are independent of the choice of such bases.


Proof From Theorem 5.6.15, it follows that there is an ordered basis $\{v_1, v_2, \ldots, v_n\}$
such that $f(v_i, v_j) = 0$ for $i \neq j$, and $f(v_i, v_i) = 0$ for all $i \geq r + 1$, where r is the
rank of f. Changing the order of $\{v_1, v_2, \ldots, v_n\}$, we may assume that $f(v_i, v_i) > 0$
for $i \leq p$, and $f(v_i, v_i) < 0$ for $p + 1 \leq i \leq r$. Take $u_i = \frac{v_i}{\sqrt{|f(v_i, v_i)|}}$ for $i \leq r$,
and $u_i = v_i$ for all $i \geq r + 1$. Then it is clear that $\{u_1, u_2, \ldots, u_n\}$ has the required
property, where $q = r - p$. It remains to show that r, p and q are independent
of the choice of basis. Since r is the rank of f, which is invariant (congruent
matrices have the same rank), it is independent of the choice of basis. It is sufficient to
show that p is independent of the choice of basis. Let $\{v_1, v_2, \ldots, v_n\}$ be another
ordered orthogonal basis such that $f(v_i, v_i) = 1$ for $i \leq p'$, and $f(v_i, v_i) = -1$ for
$p' + 1 \leq i \leq r$. It is clear from the above proposition that $V^\perp$ has $\{u_{r+1}, u_{r+2}, \ldots, u_n\}$
and $\{v_{r+1}, v_{r+2}, \ldots, v_n\}$ as bases. We show that

\[ X = \{v_1, v_2, \ldots, v_{p'}, u_{p+1}, u_{p+2}, \ldots, u_r, u_{r+1}, \ldots, u_n\} \]

is linearly independent. Suppose that

\[ a_1v_1 + a_2v_2 + \cdots + a_{p'}v_{p'} + a_{p+1}u_{p+1} + a_{p+2}u_{p+2} + \cdots + a_ru_r + a_{r+1}u_{r+1} + \cdots + a_nu_n = 0. \]

Then $u + v + w = 0$, where $u = a_1v_1 + a_2v_2 + \cdots + a_{p'}v_{p'}$,
$v = a_{p+1}u_{p+1} + a_{p+2}u_{p+2} + \cdots + a_ru_r$, and $w = a_{r+1}u_{r+1} + a_{r+2}u_{r+2} + \cdots + a_nu_n$. Clearly,
$f(u, u) \geq 0$, $f(v, v) \leq 0$, and $f(x, w) = 0$ for all $x \in V$ (this is because $w \in V^\perp$).
Thus,

\[ 0 = f(u, u + v + w) = f(u, u) + f(u, v) + f(u, w) = f(u, u) + f(u, v), \]

and similarly, since f is symmetric,

\[ 0 = f(v, u + v + w) = f(v, v) + f(u, v). \]

From the above two equations, $f(u, u) = f(v, v)$. Since f restricted to the
subspace generated by $\{v_1, v_2, \ldots, v_{p'}\}$ is positive definite, and f restricted to the
subspace generated by $\{u_{p+1}, u_{p+2}, \ldots, u_r\}$ is negative definite, it follows that
$f(u, u) = 0 = f(v, v)$. This, in turn, implies that $u = 0 = v$, and so also $w = 0$.
Since $\{v_1, v_2, \ldots, v_{p'}\}$, $\{u_{p+1}, u_{p+2}, \ldots, u_r\}$, and $\{u_{r+1}, u_{r+2}, \ldots, u_n\}$ are linearly
independent, it follows that X is linearly independent. Hence $n - p + p' \leq n$, and so
$p' \leq p$. Interchanging the role of the bases, we see that $p \leq p'$. The result follows. ♯


Remark 5.6.23 p is the largest among the dimensions of subspaces of V over which f
is positive definite. Similarly, q is the largest among the dimensions of the subspaces
of V over which f is negative definite.


Corollary 5.6.24 Every real symmetric matrix is congruent to a unique diagonal
matrix of the form

\[ \mathrm{Diag}(\underbrace{1, 1, \ldots, 1}_{p}, \underbrace{-1, -1, \ldots, -1}_{q}, \underbrace{0, 0, \ldots, 0}_{n-p-q}). \]


Since for any $r \leq n$, there are $r + 1$ pairs of non-negative integers p, q such that
$p + q = r$, we have the following corollary.


Corollary 5.6.25 There are $\frac{(n+1)(n+2)}{2}$ distinct congruence classes of $n \times n$ real
symmetric matrices. ♯


Definition 5.6.26 If f (A) is a real symmetric bilinear form (matrix), the uniquely
determined integer $p - q$ is called the signature of f (A).
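Once a real symmetric form has been brought to any diagonal form by congruence, Sylvester's theorem says that only the signs of the diagonal entries matter. The sketch below reads off the rank, p, q and the signature from such a diagonal, and counts the pairs (p, q) with p + q ≤ n to confirm the number of congruence classes in Corollary 5.6.25. The function names are illustrative, and the sample diagonal is the one obtained in Example 5.6.9.

```python
from fractions import Fraction

def inertia(diagonal):
    """Rank, p, q and signature read off from a diagonal congruent form."""
    p = sum(1 for d in diagonal if d > 0)   # number of positive entries
    q = sum(1 for d in diagonal if d < 0)   # number of negative entries
    return {"rank": p + q, "p": p, "q": q, "signature": p - q}

def count_congruence_classes(n):
    """Number of pairs (p, q) of non-negative integers with p + q <= n."""
    return sum(1 for p in range(n + 1) for q in range(n + 1 - p))

# Diagonal form Diag(2, -1/2, -4) from Example 5.6.9.
example = inertia([2, Fraction(-1, 2), -4])
print(example)

# Compare the direct count with the closed formula (n + 1)(n + 2)/2.
for n in range(5):
    print(n, count_congruence_classes(n), (n + 1) * (n + 2) // 2)
```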


Real Skew-Symmetric Forms (Matrices) 


Recall that a bilinear form f is skew-symmetric if $f(x, y) = -f(y, x)$. It follows
that f is skew-symmetric if and only if its matrix with respect to any basis is
skew-symmetric. If the field F is of characteristic 0, then $f(x, x) = 0$ for all $x \in V$.


Proposition 5.6.27 Let f be a skew-symmetric bilinear form on a finite-dimensional
vector space V over a field of characteristic 0. Suppose that $f(x, y) \neq 0$. Then $\{x, y\}$
is linearly independent. Further, if $z \in W$, where W is the subspace generated by
$\{x, y\}$, then

\[ z = \frac{f(z, y)}{f(x, y)} x - \frac{f(z, x)}{f(x, y)} y. \]

Proof Suppose that $ax + by = 0$. Then $0 = f(ax + by, x) = af(x, x) + bf(y, x) = -bf(x, y)$.
Since $f(x, y) \neq 0$, $b = 0$. Similarly, $0 = f(ax + by, y) = af(x, y) + bf(y, y) = af(x, y)$.
Again, since $f(x, y) \neq 0$, $a = 0$. This proves that $\{x, y\}$ is linearly independent.
Next, if $z = ax + by$, then $f(z, x) = af(x, x) + bf(y, x) = -bf(x, y)$. Thus,
$b = -\frac{f(z, x)}{f(x, y)}$. Similarly, $a = \frac{f(z, y)}{f(x, y)}$. Hence

\[ z = \frac{f(z, y)}{f(x, y)} x - \frac{f(z, x)}{f(x, y)} y. \] ♯


Proposition 5.6.28 Under the hypothesis of the above proposition, $V = W \oplus W^\perp$.


Proof Let $v \in V$ and $w = \frac{f(v, y)}{f(x, y)} x - \frac{f(v, x)}{f(x, y)} y$. Then

\[ f(v - w, x) = f(v, x) + \frac{f(v, x)}{f(x, y)} f(y, x) = f(v, x) - f(v, x) = 0. \]

Similarly, $f(v - w, y) = 0$. This shows that $v - w \in W^\perp$. Thus $V = W + W^\perp$.
Suppose that $v = ax + by \in W^\perp$. Then $0 = f(v, x) = -bf(x, y)$, and
$0 = f(v, y) = af(x, y)$. Since $f(x, y) \neq 0$, it follows that $a = 0 = b$. Hence $v = 0$. ♯


Theorem 5.6.29 Let f be a skew-symmetric bilinear form on a finite-dimensional
vector space V of dimension n over a field F of characteristic 0. Then there is a
non-negative integer r, and an ordered basis

\[ \{u_1, v_1, u_2, v_2, \ldots, u_r, v_r, u_{2r+1}, u_{2r+2}, \ldots, u_n\} \]

of V such that
(i) $f(u_i, v_i) = 1 = -f(v_i, u_i)$ for all $i \leq r$,
(ii) $f(u_i, u_j) = 0 = f(v_i, v_j)$ for all i, j, and
(iii) $f(u_i, v_j) = 0$ for all $i \neq j$.


Proof The proof is by induction on the dimension of V. If the dimension of V is 0,
there is nothing to do. If the dimension of V is 1, then also any skew-symmetric bilinear
form on V is 0, and there is nothing to do. Assume that the result is true over all
vector spaces of dimension less than n. Let V be a vector space of dimension n, and
f a skew-symmetric bilinear form on V. If f is 0, then there is nothing to do. Suppose
that $f \neq 0$. Then there exist $u_1, v_1 \in V$ such that $f(u_1, v_1) \neq 0$. Multiplying $u_1$ by a
suitable scalar, we may suppose that $f(u_1, v_1) = 1$. From the above results, it
follows that $\{u_1, v_1\}$ is linearly independent, and if W is the subspace generated
by $\{u_1, v_1\}$, then $V = W \oplus W^\perp$. Clearly, the dimension of $W^\perp$ is $n - 2$, and the
restriction of f to $W^\perp$ is skew-symmetric. By the induction hypothesis, there exists
an ordered basis $\{u_2, v_2, u_3, v_3, \ldots, u_r, v_r, u_{2r+1}, u_{2r+2}, \ldots, u_n\}$ of $W^\perp$ such that

(i) $f(u_i, v_i) = 1 = -f(v_i, u_i)$ for all i, $2 \leq i \leq r$,

(ii) $f(u_i, u_j) = 0 = f(v_i, v_j)$ for all $i, j \geq 2$, and

(iii) $f(u_i, v_j) = 0$ for all $i \neq j$, $i, j \geq 2$.

Clearly, $\{u_1, v_1, u_2, v_2, \ldots, u_r, v_r, u_{2r+1}, u_{2r+2}, \ldots, u_n\}$ has the required
properties. ♯
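The induction in this proof is constructive. The sketch below follows it over the rational numbers: it picks a pair x, y with f(x, y) ≠ 0, scales x so that f(x, y) = 1, and replaces every other vector v by v − f(v, y)x + f(v, x)y, which lies in the orthogonal complement of the span of x and y by Proposition 5.6.28. The function names `form` and `symplectic_basis` and the sample matrix are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction

def form(A, x, y):
    """Value f(x, y) = x^t A y of the bilinear form with matrix A."""
    n = len(A)
    return sum(x[i] * A[i][j] * y[j] for i in range(n) for j in range(n))

def symplectic_basis(A):
    """Return (pairs, rest) for a skew-symmetric matrix A over Q.

    pairs is a list of (u_i, v_i) with f(u_i, v_i) = 1; rest spans the
    part of the space on which f vanishes identically.
    """
    n = len(A)
    rest = [[Fraction(int(i == j)) for i in range(n)] for j in range(n)]
    pairs = []
    while True:
        found = None
        for i in range(len(rest)):
            for j in range(i + 1, len(rest)):
                if form(A, rest[i], rest[j]) != 0:
                    found = (i, j)
                    break
            if found is not None:
                break
        if found is None:
            return pairs, rest          # f is zero on the span of rest
        i, j = found
        c = form(A, rest[i], rest[j])
        x = [a / c for a in rest[i]]    # scale so that f(x, y) = 1
        y = rest[j]
        others = [v for k, v in enumerate(rest) if k not in (i, j)]
        # Project the other vectors into the orthogonal complement of span{x, y}.
        rest = [[vk - form(A, v, y) * xk + form(A, v, x) * yk
                 for vk, xk, yk in zip(v, x, y)] for v in others]
        pairs.append((x, y))

A = [[0, 1, 2], [-1, 0, 3], [-2, -3, 0]]   # a hypothetical skew-symmetric matrix
pairs, rest = symplectic_basis(A)
basis = [vec for pair in pairs for vec in pair] + rest
gram = [[form(A, x, y) for y in basis] for x in basis]
rank = 2 * len(pairs)
```

The rank is twice the number of pairs found, which illustrates why it is always even.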


Corollary 5.6.30 The rank of a skew-symmetric bilinear form on a vector space
over a field of characteristic 0 is always even. ♯




Following is the matrix form of the theorem. 


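Corollary 5.6.31 Every $n \times n$ skew-symmetric matrix A with entries in a field F of
characteristic 0 is congruent to a block diagonal matrix of the form

\[ \begin{pmatrix} 0 & 1 & & & & \\ -1 & 0 & & & & \\ & & \ddots & & & \\ & & & 0 & 1 & \\ & & & -1 & 0 & \\ & & & & & 0_{n-2r} \end{pmatrix}, \]

with r diagonal blocks $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ followed by the $(n - 2r) \times (n - 2r)$ zero matrix,
where 2r is the rank of A. ♯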


It is clear that there is no nondegenerate skew-symmetric bilinear form on a vector
space of odd dimension.

Suppose that f is a nondegenerate skew-symmetric bilinear form on a vector space
V of even dimension 2r over a field of characteristic 0. Arranging the basis vectors
$u_1, v_1, u_2, v_2, \ldots, u_r, v_r$ of the theorem as $u_1, u_2, \ldots, u_r, v_r, \ldots, v_2, v_1$ and looking
at the matrix representation, we get the following corollary.


Corollary 5.6.32 Every nonsingular $2n \times 2n$ skew-symmetric matrix with entries
in a field F of characteristic 0 is congruent to a matrix of the form

\[ \begin{pmatrix} 0_n & J \\ -J & 0_n \end{pmatrix}, \]

where $0_n$ is the $n \times n$ zero matrix and J is the $n \times n$ matrix

\[ J = \begin{pmatrix} 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & \cdots & 1 & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ 1 & 0 & \cdots & 0 & 0 \end{pmatrix}. \]


Quadratic Forms, Orthogonal Reduction 


Let V be a finite-dimensional vector space over a field F. A map q from V to F is
called a quadratic form, if there is a bilinear form f on V such that $q(v) = f(v, v)$
for all $v \in V$. We assume that the field is of characteristic different from 2. If f is
skew-symmetric, then $q(v) = f(v, v) = 0$. Every bilinear form f can be written as
$f = f_s + f_{ss}$, where $f_s(x, y) = \frac{1}{2}(f(x, y) + f(y, x))$ is symmetric and
$f_{ss}(x, y) = \frac{1}{2}(f(x, y) - f(y, x))$ is skew-symmetric. But, then $q(v) = f_s(v, v)$. Thus, for
every quadratic form q, there is a symmetric bilinear form f such that $q(v) = f(v, v)$
for all $v \in V$. The following proposition says that the symmetric bilinear form is
uniquely determined by the quadratic form.


Proposition 5.6.33 Let q be a quadratic form corresponding to a symmetric bilinear
form f. Then

\[ f(v, w) = \tfrac{1}{4}\big(q(v + w) - q(v - w)\big). \]


Proof $f(v + w, v + w) - f(v - w, v - w) = 4f(v, w)$. ♯
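As a small illustration of this identity, the sketch below recovers the matrix of the symmetric bilinear form from a quadratic form by evaluating the formula on pairs of standard basis vectors. The quadratic form used is the one that appears in Example 5.6.36; the function names `q` and `polar` are illustrative.

```python
from fractions import Fraction

def q(X):
    """The quadratic form q(X) = 2 x1 x2 + 4 x1 x3 + 2 x2 x3 on R^3."""
    x1, x2, x3 = X
    return 2 * x1 * x2 + 4 * x1 * x3 + 2 * x2 * x3

def polar(q, v, w):
    """f(v, w) = (q(v + w) - q(v - w)) / 4, the polarization of q."""
    plus = [a + b for a, b in zip(v, w)]
    minus = [a - b for a, b in zip(v, w)]
    return Fraction(q(plus) - q(minus), 4)

# Standard basis of R^3 and the matrix [f(e_i, e_j)] of the symmetric form.
E = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
A = [[polar(q, E[i], E[j]) for j in range(3)] for i in range(3)]
for row in A:
    print([str(entry) for entry in row])
```

The matrix obtained is symmetric, and it agrees with the matrix of the bilinear form in Example 5.6.9.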


A quadratic form on a vector space V with respect to an ordered basis $\{u_1, u_2, \ldots, u_n\}$
is given by $q(v) = \sum_{i,j} x_i a_{ij} x_j$, where $a_{ij} = f(u_i, u_j)$, f being the symmetric
bilinear form representing q, and $v = \sum_{i=1}^{n} x_i u_i$. The following two results follow
from the corresponding results on symmetric bilinear forms over the field $\mathbb{C}$ of
complex numbers, and over the field $\mathbb{R}$ of real numbers.


Corollary 5.6.34 Let q be a quadratic form on a vector space V over the field $\mathbb{C}$ of
complex numbers. Then there is an ordered basis $\{u_1, u_2, \ldots, u_n\}$ of V such that the
representation of q with respect to this basis is

\[ q(v) = x_1^2 + x_2^2 + \cdots + x_r^2, \]

where r is the rank of q (the rank of q is defined to be the rank of the corresponding f), and
$v = \sum_{i=1}^{n} x_i u_i$. ♯


Corollary 5.6.35 Let q be a quadratic form on a real vector space V. Then there is
an ordered basis $\{u_1, u_2, \ldots, u_n\}$ of V such that

\[ q(v) = x_1^2 + x_2^2 + \cdots + x_p^2 - x_{p+1}^2 - x_{p+2}^2 - \cdots - x_r^2, \]

where $v = \sum_{i=1}^{n} x_i u_i$, r is the rank of q, and $2p - r$ is the signature of q (the signature
of q is defined to be the signature of the corresponding symmetric bilinear form). ♯


Example 5.6.36 Consider the bilinear form f on $\mathbb{R}^3$ given in Example 5.6.9. Its
reduced diagonal form and the matrix of transformation P are given in that example.
Clearly, the form is further congruent to $x_1y_1 - x_2y_2 - x_3y_3$, and the corresponding
matrix is congruent to diag(1, −1, −1). The rank is 3, and the signature is −1.
The matrix of transformation is given by $P\,\mathrm{diag}(\frac{1}{\sqrt{2}}, \sqrt{2}, \frac{1}{2})$ (check it). Let q be the
quadratic form on $\mathbb{R}^3$ given by

\[ q(X) = 2x_1x_2 + 4x_1x_3 + 2x_2x_3. \]

It can be seen easily that the symmetric bilinear form of q is the bilinear form given
in Example 5.6.9. The congruent reduction of q to the normal form is

\[ q(X) = x_1^2 - x_2^2 - x_3^2. \]

The matrix of transformation is as above. One can obtain the ordered basis of $\mathbb{R}^3$
using the matrix of transformation with respect to which the quadratic form is in
reduced form as given above.
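The "check it" in this example can be carried out numerically. The sketch below assumes that the matrix of transformation is the product Q = P·diag(1/√2, √2, 1/2), with P having columns (1, 1, 0), (−1/2, 1/2, 0) and (−1, −2, 1), and verifies in floating point that the transpose of Q times A times Q is diag(1, −1, −1) up to rounding. The variable names are illustrative.

```python
import math

# Matrix of the bilinear form of Example 5.6.9.
A = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]

# Assumed matrix of transformation P (columns form an orthogonal basis for f).
P = [[1, -0.5, -1], [1, 0.5, -2], [0, 0, 1]]

# Scaling factors 1/sqrt(|d_i|) for the diagonal entries d = (2, -1/2, -4).
S = [1 / math.sqrt(2), math.sqrt(2), 0.5]

# Q = P * diag(S): scale the jth column of P by S[j].
Q = [[P[i][j] * S[j] for j in range(3)] for i in range(3)]

# N = Q^t A Q, expected to be diag(1, -1, -1) up to rounding error.
N = [[sum(Q[k][i] * A[k][l] * Q[l][j] for k in range(3) for l in range(3))
      for j in range(3)] for i in range(3)]

def q(X):
    """The quadratic form q(X) = 2 x1 x2 + 4 x1 x3 + 2 x2 x3."""
    return 2 * X[0] * X[1] + 4 * X[0] * X[2] + 2 * X[1] * X[2]

def substitute(Y):
    """The change of variables X = Q Y."""
    return [sum(Q[i][j] * Y[j] for j in range(3)) for i in range(3)]

for row in N:
    print([round(entry, 12) for entry in row])
```

After the substitution X = QY, the value `q(substitute(Y))` should agree with the normal form evaluated at Y, up to rounding.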


Using Corollary 5.5.17 (orthogonal reduction), we get the following proposition. 


Proposition 5.6.37 Let q be a quadratic form on a real inner product space V. Then
there is an orthonormal ordered basis $\{u_1, u_2, \ldots, u_n\}$ of V such that

\[ q(v) = \lambda_1 x_1^2 + \lambda_2 x_2^2 + \cdots + \lambda_n x_n^2, \]

where $v = \sum_{i=1}^{n} x_i u_i$, and $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of the matrix of q
corresponding to an orthonormal basis of V. ♯


Remark 5.6.38 Corollary 5.5.17 gives an algorithm to find an orthogonal transfor- 
mation which reduces the given quadratic form to the diagonal form. This reduction 
is called an orthogonal reduction. 


Surfaces in $\mathbb{R}^3$ Represented by the Equations of Second Degree


A general equation of second degree representing a surface in $\mathbb{R}^3$ is given by

\[ f(x, y, z) = ax^2 + by^2 + cz^2 + 2hxy + 2fyz + 2gxz + 2ux + 2vy + 2wz + d = 0. \]

Let us denote the column vector $(x, y, z)^t$ by X. Consider the quadratic form q on $\mathbb{R}^3$ given by

\[ q(X) = ax^2 + by^2 + cz^2 + 2hxy + 2fyz + 2gxz. \]

Then $f(x, y, z) = q(X) + 2ux + 2vy + 2wz + d$. The matrix A of the quadratic
form q is given by

\[ A = \begin{pmatrix} a & h & g \\ h & b & f \\ g & f & c \end{pmatrix}, \]

and $q(X) = X^tAX$. Using Corollary 5.5.17, we can find an orthogonal matrix O such that
$O^tAO = \mathrm{diag}(\lambda_1, \lambda_2, \lambda_3)$. Put $X' = O^tX$, where $X' = (x', y', z')^t$. Then $X = OX'$.
Substituting $X = OX'$, the quadratic form reduces to the form

\[ q(X) = \lambda_1 x'^2 + \lambda_2 y'^2 + \lambda_3 z'^2. \]


Suppose that

\[ O = \begin{pmatrix} l_1 & m_1 & n_1 \\ l_2 & m_2 & n_2 \\ l_3 & m_3 & n_3 \end{pmatrix}. \]

The fact that O is orthogonal means that the rows represent direction cosines of three
mutually perpendicular axes, and the columns also represent the direction cosines of three
mutually perpendicular axes. Further, then $f(x, y, z)$ reduces to

\[ f(x', y', z') = \lambda_1 x'^2 + \lambda_2 y'^2 + \lambda_3 z'^2 + 2u'x' + 2v'y' + 2w'z' + d, \]

where $u' = ul_1 + vl_2 + wl_3$, $v' = um_1 + vm_2 + wm_3$, and $w' = un_1 + vn_2 + wn_3$.

If the quadratic form q is nondegenerate, then all $\lambda_i$ are nonzero, and then completing
the squares, $f(x', y', z')$ reduces to

\[ \lambda_1\Big(x' + \frac{u'}{\lambda_1}\Big)^2 + \lambda_2\Big(y' + \frac{v'}{\lambda_2}\Big)^2 + \lambda_3\Big(z' + \frac{w'}{\lambda_3}\Big)^2 + d', \]

where $d' = d - \frac{u'^2}{\lambda_1} - \frac{v'^2}{\lambda_2} - \frac{w'^2}{\lambda_3}$. Substituting $x'' = x' + \frac{u'}{\lambda_1}$,
$y'' = y' + \frac{v'}{\lambda_2}$, $z'' = z' + \frac{w'}{\lambda_3}$, the equation reduces to

\[ \lambda_1 x''^2 + \lambda_2 y''^2 + \lambda_3 z''^2 = -d'. \]

If $d' = 0$, then it represents a cone. If all $\lambda_i$ together with $d'$ are positive, then such
a surface does not exist. Suppose that all $\lambda_i$ are positive, and $d'$ is negative. Then it
represents an ellipsoid with center given by $x'' = y'' = z'' = 0$, and the principal
planes given by $x'' = 0$, $y'' = 0$, $z'' = 0$ (the principal axes are the lines in which these
planes meet pairwise). Expressing $x''$, $y''$, and $z''$ in terms
of $x'$, $y'$, $z'$, and then, in turn, expressing $x'$, $y'$, $z'$ in terms of x, y, z with the help of the
orthogonal transformation O, we get the center, principal axes, and principal planes in
terms of the original coordinate system.

If two of the $\lambda_i$ are positive, and the other is negative, and also $d'$ is negative, then
it represents a one-sheeted hyperboloid whose invariants can be obtained as above.
If one of them is positive, and the other two are negative, then it represents a two-sheeted
hyperboloid.

Next, suppose that $\lambda_3 = 0$. Then the above equation will reduce to
$\lambda_1 x''^2 + \lambda_2 y''^2 = 2az''$, or to $\lambda_1 x''^2 + \lambda_2 y''^2 = a$. In case 1 it represents an elliptic paraboloid
if both $\lambda_1$, $\lambda_2$ are positive, and it represents a hyperbolic paraboloid otherwise. Further,
in case 2 it represents an elliptic cylinder, or a hyperbolic cylinder. If two eigenvalues are
0, then it reduces to the form $x''^2 = 4ay''$, or to the form $x''^2 = a$. In case 1 it
represents a parabolic cylinder, and in case 2 it represents a pair of parallel planes.
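For the nondegenerate case, the discussion above amounts to a small decision procedure. The sketch below completes the squares to get d′ and then classifies the surface λ₁x″² + λ₂y″² + λ₃z″² = −d′ by the signs of the λᵢ. Two points are assumptions of this sketch rather than statements of the text: when d′ > 0 the equation is multiplied by −1 so that the right-hand side becomes positive before counting signs, and the function names and the sample numbers are hypothetical.

```python
from fractions import Fraction

def complete_squares(lams, lin, d):
    """Given eigenvalues (l1, l2, l3), linear coefficients (u', v', w') and d,
    return the shifts (u'/l1, v'/l2, w'/l3) and d' = d - sum(c^2 / l)."""
    lams = [Fraction(l) for l in lams]
    lin = [Fraction(c) for c in lin]
    shifts = tuple(c / l for c, l in zip(lin, lams))
    d_prime = Fraction(d) - sum(c * c / l for c, l in zip(lin, lams))
    return shifts, d_prime

def classify_central_quadric(lams, d_prime):
    """Classify l1 x''^2 + l2 y''^2 + l3 z''^2 = -d' when all l_i are nonzero."""
    if any(l == 0 for l in lams):
        raise ValueError("all eigenvalues must be nonzero in this case")
    if d_prime == 0:
        # The text calls this a cone; if all eigenvalues share a sign,
        # the only real point is the vertex.
        return "cone"
    if d_prime > 0:
        # Assumption of this sketch: multiply the equation by -1 so that
        # the right-hand side -d' becomes positive.
        lams = [-l for l in lams]
    positives = sum(1 for l in lams if l > 0)
    return {3: "ellipsoid",
            2: "one-sheeted hyperboloid",
            1: "two-sheeted hyperboloid",
            0: "no real surface"}[positives]

# Hypothetical data: eigenvalues 1, 1, 2, linear part 2x' - 2y', constant -1.
shifts, d_prime = complete_squares((1, 1, 2), (1, -1, 0), -1)
print(shifts, d_prime, classify_central_quadric((1, 1, 2), d_prime))
```

The center in the primed coordinates is the negative of `shifts`; carrying it back to the original coordinates uses X = OX′ as described above.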

We illustrate the above discussion by means of an example. 


Example 5.6.39 Consider the second-degree equation 


a? py the -xztx—-1=0. 




The matrix of the quadratic form associated to this equation is the matrix A of 
Example 5.5.19. Its eigenvalues are 1, 1, 2. If we transform the equation using the 
orthogonal transformation O (see Example 5.5.19), the equation is transformed to 


x? + y? + 22? + ax! — ay = 1. 


Completing the square it reduces to 


/ t «2 , 1 \2 (2... § 
(x + 5%) +O Sag! + 2z = 4 


This represents ellipsoid with axes = vS /5. The center is given by x! = 
mee y= sya and z’ = 0.SinceX = OX’, substituting the values we get that 

x= —i, y= =F Ai z= —4. The principal planes are given by x’ = =57 =y, 

and z’ = 0. Using the transformation X' = O'X, we see that the principal planes 

are a i si Pe xyR and oe = 

Exercises 


5.6.1 Show that f(X,Y) = xy, + 2my, + 3x1y2 + x2y3 + 4x3y3 defines a 
bilinear form on R?. Find its matrix representation with respect to the standard basis, 
and also with respect to the ordered basis {e; + e2, e2 + e3, e; + e3}. Conclude that 
these two matrices are congruent to each other. Is this bilinear form symmetric? Find 
its rank. Is this nondegenerate? 


5.6.2 Let $V = M_n(F)$ denote the vector space of all $n \times n$ matrices with entries
in F. Define a map f from $V \times V$ to F by $f(A, B) = \mathrm{Tr}(A^tCB)$, where C is a fixed
matrix. Is this symmetric? Find its rank in terms of the matrix C.


5.6.3 Determine which of the following define a bilinear form on R?. 
MAX, Y) = xp + yp + myo + 3y3. 
(ii) f(X, Y) = 2 forall X,Y eR’. 
(iit) F(X, Y) = x1x2 + yiy2 + x13. 
(iv) f(X, Y) = x1y3 — xoy2 + x32. 


5.6.4 Let V be the vector space of $3 \times 3$ matrices over $\mathbb{R}$ and

\[ A = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 2 & 4 \\ 3 & 4 & 5 \end{pmatrix}. \]

Define a map f from $V \times V$ to $\mathbb{R}$ by $f(X, Y) = \mathrm{Tr}(X^tAY)$. Show that f is a bilinear
form. Is it a symmetric bilinear form (observe that A is symmetric)? Find the matrix
of f relative to the ordered basis $\{e_{11}, e_{12}, e_{13}, e_{21}, e_{22}, e_{23}, e_{31}, e_{32}, e_{33}\}$, where $e_{ij}$ is
the matrix whose ith row jth column entry is 1 and the rest of the entries are 0. Find
its rank.




5.6.5 Let V be as above. Define a map f from $V \times V$ to $\mathbb{R}$ by
$f(A, B) = \mathrm{Tr}(AB) + \mathrm{Tr}(A)\mathrm{Tr}(B)$. Show that it is a symmetric bilinear form. Find its rank and signature.


5.6.6 Show that a bilinear form on V is product of linear functionals if and only if 
it is of rank 1. 


5.6.7 Let f be a nondegenerate form on a finite-dimensional vector space V, and g a
bilinear form on V. Show that there exists a unique linear transformation $T_1$ on V given
by $g(v, w) = f(T_1(v), w)$, and also there exists a unique linear transformation $T_2$
on V such that $g(v, w) = f(v, T_2(w))$.


5.6.8 Reduce the following symmetric bilinear forms on C? congruently to the 
normal form, and find the matrices of transformations. 

(i) f (X,Y) = xyyo + ixyy3 + xy + 1x3). 

(i) (X,Y) = A+ xy + x13 + %3y1 + ixay3 + x32. 

(iii) A(X, Y) = x1y3 + xoy3 + x3y. + 232. 


5.6.9 Check if the following pairs of bilinear forms on $\mathbb{C}^3$ are congruent.
(i) (f, g).
(ii) (g, h).
(iii) (f, h),

where f, g, h are bilinear forms defined in Exercise 5.6.8.


5.6.10 Reduce the following complex symmetric matrix congruently to a matrix in
normal form.

\[ \begin{pmatrix} 1 & 2 & i \\ 2 & 4 & 7 \\ i & 7 & -i \end{pmatrix} \]


5.6.11 Reduce the following symmetric bilinear forms over $\mathbb{R}^3$ congruently to
normal form. Find the matrix of transformation in each case, and also the rank and signature:
(i) $f(X, Y) = 2x_1y_1 + 3x_1y_2 + x_1y_3 + 3x_2y_1 + x_2y_2 - x_2y_3 + x_3y_1 - x_3y_2$.
(ii) $g(X, Y) = x_1y_2 + x_2y_3 + x_2y_1 + x_3y_2$.
(iii) $h(X, Y) = x_1y_3 - x_2y_2 + x_3y_1$.


5.6.12 Check if the following pairs of real symmetric bilinear forms are congruent.
(i) (f, g).
(ii) (g, h).
(iii) (f, h),

where f, g, h are as in the above exercise.


5.6.13 Find the ranks and signatures of the following matrices by congruently reduc- 
ing them in to the standard canonical forms. 




(i) 
102 


246 


(ii) 
123 
B= | 234 
345 


(iii) 
345 

C= 1456 

567 


5.6.14 Determine which of the following pairs of matrices are congruent.
(i) (A, B).
(ii) (A, C).
(iii) (B, C),

where A, B, C are as in the above exercise.


5.6.15 Can we have a nondegenerate skew-symmetric bilinear form on a complex 
vector space of dimension 3? Support. 


5.6.16 Find the number of congruence classes of skew-symmetric bilinear forms 
on a real vector space of dimension 3. 


5.6.17 Show that any two nondegenerate skew-symmetric 2n x 2n real matrices are 
congruent. 


5.6.18 Reduce, orthogonally, the following quadratic forms in to standard canonical 
form. 

@yv+2t+ yr tam — x. 

(ii) 4x? + By? + 227 + 4yz — 4xy. 

(ili) xy + yz + zx. 


5.6.19 Find the eigenvalues of the real symmetric matrix A, and also an orthogonal 
matrix O such that O'AO is diagonal, where 


2 2°05 
A= | -2-1 10 
5 10 —22 


5.6.20 Reduce the following surfaces in to standard form, find their nature, and also 
their invariants such as center, principal axis, and principal planes. 




@y + 2 + yz +m — xy — 2x + 2y — 22 +1= 0. 

(ii) 4x” + 3y? + 22? + 4yz — 4xy — 4x — Oy — 82 —-6=0. 

(iii) 3x7 + 6y? + 112? + 8yz t+ 1l0x+x-ytz-4=0., 
5.6.21 Let f be a nondegenerate bilinear form on a vector space V. Let O(f)
denote the set of all linear transformations T which preserve f in the sense that
$f(T(x), T(y)) = f(x, y)$ for all $x, y \in V$. Show that O(f) is a group.


5.6.22 The bilinear form $\langle\ ,\ \rangle_L$ on $\mathbb{R}^{n+1}$ defined by

\[ \langle X, Y \rangle_L = x_1y_1 + x_2y_2 + \cdots + x_ny_n - x_{n+1}y_{n+1} \]

is called the Lorentz inner product. Show that $\langle\ ,\ \rangle_L$ is a symmetric nondegenerate
bilinear form of signature $n - 1$. The transformations preserving the Lorentz inner
product are called the Lorentz transformations. Show that the set O(n, 1) of all Lorentz
transformations forms a group under composition of maps. This group is called the
Lorentz Group.


5.6.23 A set $\{u_1, u_2, \ldots, u_r\}$ is called a Lorentz orthonormal set if $\langle u_i, u_j \rangle_L = 0$ for $i \neq j$,
and $\langle u_i, u_i \rangle_L = \pm 1$. Use Sylvester's law to show that at most one i will be
such that $\langle u_i, u_i \rangle_L = -1$.


5.6.24 Show that every Lorentz orthonormal set is linearly independent. Show also 
that any Lorentz orthonormal set can be enlarged to a Lorentz orthonormal basis. 


5.6.25 An $(n + 1) \times (n + 1)$ matrix A is called a Lorentz matrix if $AJA^t = J$, where
$J = \mathrm{Diag}(1, 1, \ldots, 1, -1)$. Show that a linear transformation is a Lorentz
transformation if and only if its matrix representation with respect to any Lorentz orthonormal
basis is a Lorentz matrix.


5.6.26 Show that the determinant of a Lorentz matrix is $\pm 1$.


5.6.27 Call a Lorentz matrix a positive Lorentz matrix if < A@,a7, @:4; > > 0. 
Show that the set PSO(n, 1) of positive Lorentz matrices of determinant | is a group 
under the product of matrices. This group is called the special positive Lorentz group. 


5.6.28 Let A € PSO(1, 1). Show that there is a unique x > 0 such that 


Ho ae se 


sinhx coshx 


Describe the group PSO(1, 1). 


5.6.28". Try to describe the geometry of Lorentz transformations. More explicitly, 
show that any matrix A € PSO(n, 1) is similar to a matrix of the form 


Bo Om-=1)x2 
Orx~—-1) CC , 


where B € PSO(n — 1),andC € SOC, 1).Compare with the geometry of orthogonal 
transformation. 


Chapter 6 
Canonical Forms, Jordan and Rational 
Forms 


In the previous chapter, we studied congruence classes of matrices over some special 
type of fields. This chapter is devoted to describe similarity classes of matrices with 
entries in some special type of fields. For the purpose, we first introduce the concept 
of a module over a ring, and obtain the structure theory of modules over a principal 
ideal domain. The reader is referred to Algebra 1 for the definition and some basic 
properties of rings. 


6.1 Concept of a Module over a Ring 


A module over a ring R is a structure obtained by replacing a field F in the definition 
of a vector space over F by a ring R. Thus, 


Definition 6.1.1 Let R be a ring with identity 1. A left R-module is an abelian group
(M, +) together with a map · from $R \times M$ to M (the image of (a, x) under · is denoted
by $a \cdot x$) such that

(i) $(a + b) \cdot x = a \cdot x + b \cdot x$,
(ii) $a \cdot (x + y) = a \cdot x + a \cdot y$,
(iii) $(ab) \cdot x = a \cdot (b \cdot x)$, and
(iv) $1 \cdot x = x$

for all $a, b \in R$ and $x, y \in M$.


In the similar manner we can define right modules. 
We also say that M is a left(right)R-module or M is a left(right) module over R. 


Remark 6.1.2 If a left R-module structure on M is such that $(ab) \cdot x = (ba) \cdot x$
for all $a, b \in R$, and $x \in M$, then this left R-module M can also be viewed as a
right R-module by defining $x \cdot a = a \cdot x$. In particular, if R is a commutative ring,
then every left R-module can also be considered as a right R-module. In this case we
simply say that M is an R-module.


© Springer Nature Singapore Pte Ltd. 2017
R. Lal, Algebra 2, Infosys Science Foundation Series in Mathematical Sciences,
DOI 10.1007/978-981-10-4256-0_6


A module over a field F is simply a vector space over F.

We shall develop the theory of left modules. The theory of right modules can be 
developed on the same lines. 

Let M be a left R-module. Let $a \in R$. Define a map $f_a$ from M to M by $f_a(x) = a \cdot x$.
Since $a \cdot (x + y) = a \cdot x + a \cdot y$ for all $x, y \in M$, $f_a \in End(M, +)$. Thus, we have
a map f from R to End(M, +) given by $f(a) = f_a$. Since

$$f_{a+b}(x) = (a + b) \cdot x = a \cdot x + b \cdot x = f_a(x) + f_b(x) = (f_a + f_b)(x),$$

and

$$f_{ab}(x) = (ab) \cdot x = a \cdot (b \cdot x) = f_a(f_b(x)) = (f_a \circ f_b)(x)$$

for all $a, b \in R$ and $x \in M$, we see that

$$f(a + b) = f(a) + f(b), \quad \text{and} \quad f(ab) = f(a) \circ f(b)$$

for all $a, b \in R$. Also

$$f(1)(x) = f_1(x) = 1 \cdot x = x = I_M(x)$$

for all $x \in M$. Hence $f(1) = I_M$, the identity of the ring End(M, +). Thus,
every left R-module M gives rise to a ring homomorphism f from R to End(M, +)
defined by $f(a)(x) = a \cdot x$.

Conversely, given any ring homomorphism f from R to End(M, +) (a ring
homomorphism is assumed to preserve identity), (M, +) becomes a left R-module with
respect to the external product $\cdot$ given by $a \cdot x = f(a)(x)$, and this, in turn, induces
the same homomorphism f. Thus, a left R-module can be viewed as a representation
of R in the ring of endomorphisms of an abelian group.

Similarly, a right R-module determines and is determined uniquely by an
anti-homomorphism f from R to End(M, +), in the sense that $f(ab) = f(b) \circ f(a)$ for all
$a, b \in R$.
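The correspondence between left module structures and ring homomorphisms into End(M, +) can be checked by brute force on a small case. The following is a minimal sketch, assuming the illustrative choice R = Z and M = Z_6 (neither the example nor the helper names `f`, `add`, `compose`, `is_ring_hom` come from the text); endomorphisms are stored as tuples of images.

```python
# Illustrative assumption: M = Z_6 viewed as a Z-module.
N = 6
ELEMENTS = range(N)


def f(a):
    """Return the endomorphism f(a): x -> a*x of (Z_6, +) as a tuple of images."""
    return tuple((a * x) % N for x in ELEMENTS)


def add(g, h):
    """Pointwise sum of two endomorphisms of (Z_6, +)."""
    return tuple((g[x] + h[x]) % N for x in ELEMENTS)


def compose(g, h):
    """Composite g o h, that is x -> g(h(x))."""
    return tuple(g[h[x]] for x in ELEMENTS)


def is_ring_hom(sample=range(-7, 8)):
    """Check f(1) = identity, f(a+b) = f(a)+f(b), f(ab) = f(a) o f(b) on a sample of integers."""
    identity = tuple(ELEMENTS)
    return f(1) == identity and all(
        f(a + b) == add(f(a), f(b)) and f(a * b) == compose(f(a), f(b))
        for a in sample
        for b in sample
    )
```

The check only samples finitely many integers, so it illustrates the identities rather than proving them.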

Let M be a left R-module and $a \in R$. Then the map $f_a$ is a group homomorphism
from (M, +) to itself. Thus,

$$a \cdot 0 = f_a(0) = 0, \quad \text{and} \quad a \cdot (-x) = f_a(-x) = -f_a(x) = -(a \cdot x)$$

for all $a \in R$ and $x \in M$.

Also, since the map $f : R \to End(M, +)$ defined by $f(a) = f_a$ is a homomorphism,
$f(0) = f_0$ is the zero map. Thus, we have

$$0 \cdot x = f_0(x) = 0,$$

and

$$(-a) \cdot x = f_{-a}(x) = -f_a(x) = -(a \cdot x)$$

for all $a \in R$ and $x \in M$.


Remark 6.1.3 Unlike in vector spaces, $a \cdot x = 0$ need not imply that [$a = 0$ or $x = 0$].
If in a module this implication holds, then we say that the module is a torsion-free
module. We say that it is a torsion module if for each $x \neq 0$, there is an element
$a \neq 0$ in R such that $ax = 0$.


Example 6.1.4 Every abelian group is naturally a Z-module. It is a torsion-free
module if and only if every nonzero element is of infinite order, and it is a torsion
module if and only if all elements are of finite order.


Example 6.1.5 Let R be a ring with identity. Then (R, +) is a left R-module. Here $\cdot$ is
the ring multiplication.


Example 6.1.6 Let R be a ring with identity. Let $R^n$ denote the n-fold Cartesian
product of R with itself. Thus, $R^n = \{(a_1, a_2, \ldots, a_n) \mid a_i \in R\}$. Clearly, $R^n$ is
an abelian group with respect to the coordinate-wise addition. Define the external
operation $\cdot$ from $R \times R^n$ to $R^n$ by

$$a \cdot (a_1, a_2, \ldots, a_n) = (a \cdot a_1, a \cdot a_2, \ldots, a \cdot a_n).$$

Then $(R^n, +)$ is a left R-module. It can also be made a right R-module.


Definition 6.1.7 Let M be a left R-module. A subset N of M is called a submodule 
of M, if 

(i) N is a subgroup of (M, +) 

(ii) The map $\cdot$ from R × M to M induces a map from R × N to N. In other words,
$a \cdot x \in N$ for all $a \in R$ and $x \in N$.


Thus, a submodule is a module in its own right.


Remark 6.1.8 If we consider a ring R with identity as left(right) module over itself, 
then the submodules, by definition, are the left(right) ideals of R. 


The proofs of the following propositions are imitations of the corresponding 
results in vector spaces. 


Proposition 6.1.9 Let M be a left R-module. Then a nonempty subset N of M is a
submodule if and only if $ax + by \in N$ for all $a, b \in R$ and $x, y \in N$. ♯


Proposition 6.1.10 Intersection of a family of submodules is a submodule. ♯


Proposition 6.1.11 Let $N_1$ and $N_2$ be submodules of an R-module M. Then $N_1 \cup N_2$
is a submodule if and only if $N_1 \subseteq N_2$ or $N_2 \subseteq N_1$. ♯


Proposition 6.1.12 Let $N_1$ and $N_2$ be submodules of a left R-module. Then
$N_1 + N_2 = \{x + y \mid x \in N_1, y \in N_2\}$ is also a submodule (called the sum of $N_1$ and $N_2$)
which is generated by $N_1 \cup N_2$. ♯


Proposition 6.1.13 Union of a chain of submodules is a submodule. ♯


Let M be a left R-module, and S a subset of M. Then the smallest submodule
of M containing S exists, and it is in fact the intersection of all submodules of
M containing S. This submodule is called the submodule generated by S or the
submodule spanned by S, and it is denoted by $\langle S \rangle$. If $\langle S \rangle = M$, then we
say that S generates M, or S is a set of generators of M. A module M is said to be
finitely generated if it has a finite set of generators.

Thus, $\langle \emptyset \rangle = \{0\}$.


Definition 6.1.14 Let S be a nonempty subset of a left R-module M. An element
$x \in M$ is called a linear combination of members of S if

$$x = a_1 x_1 + a_2 x_2 + \cdots + a_n x_n$$

for some $a_1, a_2, \ldots, a_n \in R$ and $x_1, x_2, \ldots, x_n \in S$.


Proposition 6.1.15 Let S be a nonempty subset of a left R-module M. Then $\langle S \rangle$
is the set of all linear combinations of members of S.


Proof Imitate the proof of the corresponding result in the vector space case. ♯

A subset S of a left R-module M is called independent if

(i) $0 \notin S$,

and

(ii) given a finite subset $\{x_1, x_2, \ldots, x_n\}$ of S with $x_i \neq x_j$ for $i \neq j$,

$$a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = 0 \text{ implies that } a_i x_i = 0 \text{ for all } i.$$

A subset which is not independent is called a dependent set.


Definition 6.1.16 A subset S of a module M is called linearly independent if given
a finite subset $\{x_1, x_2, \ldots, x_n\}$ of S with $x_i \neq x_j$ for $i \neq j$,

$$a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = 0 \text{ implies that } a_i = 0 \text{ for all } i.$$

A subset S which is not linearly independent is called a linearly dependent subset.


Clearly, a subset of a linearly independent (independent) set is always linearly
independent (independent).




Proposition 6.1.17 Every linearly independent subset is independent. 


Proof Let S be a linearly independent subset. Then $0 \notin S$, for otherwise $1 \cdot 0 = 0$,
whereas $1 \neq 0$. Next, since $a_i = 0$ for all i implies that $a_i x_i = 0$ for all i, the result
follows. ♯


Remark 6.1.18 An independent subset need not be a linearly independent subset. For
example, consider the Z-module $Z_6$. Then $\{\bar{2}, \bar{3}\}$ is independent but not linearly
independent (verify).
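The verification asked for in Remark 6.1.18 can be done by exhaustive search. The sketch below assumes the module is Z_6 over Z; since $a \cdot x$ in Z_n depends only on a modulo n, it is enough to try coefficients 0, ..., n-1 when testing independence. The function names `is_independent` and `dependence_witness` are hypothetical helpers introduced here for illustration.

```python
from itertools import product


def is_independent(subset, n=6):
    """Test independence (in the sense defined above) of a subset of the Z-module Z_n."""
    if any(x % n == 0 for x in subset):
        return False  # an independent set must not contain 0
    # a*x in Z_n depends only on a mod n, so coefficients 0..n-1 suffice.
    for coeffs in product(range(n), repeat=len(subset)):
        terms = [(a * x) % n for a, x in zip(coeffs, subset)]
        if sum(terms) % n == 0 and any(terms):
            return False  # the sum vanishes although some term a_i*x_i is nonzero
    return True


def dependence_witness(subset, n=6, bound=3):
    """Search for integers, not all zero, giving a vanishing combination in Z_n."""
    for coeffs in product(range(-bound, bound + 1), repeat=len(subset)):
        if any(coeffs) and sum(a * x for a, x in zip(coeffs, subset)) % n == 0:
            return coeffs
    return None
```

For the subset (2, 3) the first function reports independence, while the second finds a nontrivial integer relation (for instance 3·2 = 0 in Z_6), so the set is not linearly independent.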


Since in a vector space $a_i x_i = 0$ and $x_i \neq 0$ imply that $a_i = 0$, we have the
following proposition.


Proposition 6.1.19 A subset S of a vector space V is linearly independent if and
only if it is independent. ♯


Proposition 6.1.20 Union of a chain of linearly independent (independent) subsets
is linearly independent (independent).


Proof Let $\{S_\alpha \mid \alpha \in A\}$ be a chain of linearly independent (independent) subsets.
Then $0 \notin S_\alpha$ for all $\alpha$, and hence $0 \notin \bigcup_{\alpha \in A} S_\alpha$. Let $x_1, x_2, \ldots, x_n$ be distinct elements
of $\bigcup_{\alpha \in A} S_\alpha$. Suppose that $x_i \in S_{\alpha_i}$. Since $\{S_\alpha \mid \alpha \in A\}$ is a chain, there exists $\alpha_r$ such
that $S_{\alpha_i} \subseteq S_{\alpha_r}$ for all i. Thus, $x_1, x_2, \ldots, x_n$ all belong to $S_{\alpha_r}$. Since $S_{\alpha_r}$ is linearly
independent (independent),

$$a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = 0 \text{ implies that } a_i = 0 \ (a_i x_i = 0) \text{ for all } i.$$

This shows that the union is linearly independent (independent). ♯


Proposition 6.1.21 Every linearly independent (independent) subset can be embedded
into a maximal linearly independent (independent) subset.


Proof Let S be a linearly independent (independent) subset of a module M. Let X
be the set of all linearly independent (independent) subsets which contain S. Then
$X \neq \emptyset$, for $S \in X$. Thus, $(X, \subseteq)$ is a nonempty partially ordered set. From the
previous proposition, it follows that every chain in X has an upper bound. By
Zorn's Lemma, X has a maximal element T (say). Clearly, T is also maximal linearly
independent (independent). ♯


Remark 6.1.22 A maximal linearly independent subset of a module may be far from
a set of generators. For example, the additive group (Q, +) is a Z-module. Every
singleton $\{a\}$, $a \neq 0$, is a maximal linearly independent subset: given $a = \frac{m}{n}$ and $b = \frac{p}{q}$,
$pna - qmb = 0$. We also know that (Q, +) has no finite set of generators. However,
we have the following proposition.
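The dependence relation used in Remark 6.1.22 is easy to check with exact rational arithmetic. The sketch below uses Python's standard `fractions.Fraction`; the helper name `dependence_relation` is an illustrative assumption, not notation from the text.

```python
from fractions import Fraction


def dependence_relation(a, b):
    """Return integers (c, d) with c*a + d*b == 0 for rationals a = m/n and b = p/q.

    Following the remark, c = p*n and d = -q*m; d is nonzero whenever a is nonzero.
    """
    a, b = Fraction(a), Fraction(b)
    m, n = a.numerator, a.denominator
    p, q = b.numerator, b.denominator
    return p * n, -q * m


# Example: a = 2/3 and b = 5/7 satisfy 15*a - 14*b = 0.
c, d = dependence_relation(Fraction(2, 3), Fraction(5, 7))
```

Since the coefficient of b is nonzero for every nonzero a, no set {a, b} with a ≠ 0 and b ≠ a is linearly independent over Z, which is the maximality claimed in the remark.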


Proposition 6.1.23 Let S be a maximal independent subset of a left R-module M.
Let $x \in M$. Then there exists $a \in R - \{0\}$ such that $ax \in \langle S \rangle$.


Proof If $x \in S$, then $1x = x \in S$ and $S \subseteq \langle S \rangle$. If $x = 0$, then $0 = 1 \cdot 0 \in \langle S \rangle$.
Suppose $x \neq 0$, and $x \notin S$. Since S is supposed to be a maximal independent
subset, $S \cup \{x\}$ is dependent. Thus, there exist $a, a_1, \ldots, a_n$ in R, and $x_1, x_2, \ldots, x_n$ in
S, $x_i \neq x_j$ for $i \neq j$, such that

$$ax + a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = 0,$$

where not all of $ax, a_1 x_1, a_2 x_2, \ldots, a_n x_n$ are zero. If $ax = 0$, then

$$a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = 0,$$

where not all of $a_1 x_1, a_2 x_2, \ldots, a_n x_n$ are 0. This contradicts the supposition that S
is independent. Thus, $ax \neq 0$, and

$$ax = -a_1 x_1 - a_2 x_2 - \cdots - a_n x_n$$

belongs to $\langle S \rangle$. Since $ax \neq 0$, $a \neq 0$. ♯


Proposition 6.1.24 Let M be a left R-module and S a set of generators for M. If T 
is a subset of M such that S is properly contained in T, then T is dependent. 


Proof If $0 \in T$, then there is nothing to do. Let $x \in T - S$, $x \neq 0$. Since $\langle S \rangle = M$,
there exist $x_1, x_2, \ldots, x_n \in S$, $x_i \neq x_j$ for $i \neq j$, and $a_1, a_2, \ldots, a_n \in R$ such that
$x = a_1 x_1 + a_2 x_2 + \cdots + a_n x_n$. But, then

$$1x + (-a_1) x_1 + (-a_2) x_2 + \cdots + (-a_n) x_n = 0,$$

where $1x \neq 0$. Hence T is dependent. ♯


Definition 6.1.25 Let M be a left R-module. A subset S of M is called a
minimal or irreducible set of generators of M if $\langle S \rangle = M$, and no proper
subset of S generates M.


Remark 6.1.26 A set of generators need not contain any minimal set of generators.
In fact, a module need not have any minimal set of generators. For example, the
Z-module (Q, +) does not have any minimal set of generators. In vector spaces, however,
we have already seen that every set of generators contains a minimal set of generators
(a basis).


Direct Sum of Modules 


Let $M_1, M_2, \ldots, M_r$ be left R-modules. Then $M = M_1 \times M_2 \times \cdots \times M_r$ is an
abelian group with respect to the coordinate-wise addition. If we define the external
multiplication $\cdot$ by $a \cdot (x_1, x_2, \ldots, x_r) = (a x_1, a x_2, \ldots, a x_r)$, then M becomes a
left R-module. This is called the external direct sum of $M_1, M_2, \ldots, M_r$.

A module M is said to be the direct sum (internal direct sum) of its submodules
$M_1, M_2, \ldots, M_r$, if every element x of M has a unique representation as

$$x = x_1 + x_2 + \cdots + x_r,$$

where $x_i \in M_i$ for all i. The notation

$$M = M_1 \oplus M_2 \oplus \cdots \oplus M_r$$

stands to say that M is the direct sum of its submodules $M_1, M_2, \ldots, M_r$.


Proposition 6.1.27 Let $M_1, M_2, \ldots, M_r$ be submodules of a module M. Then the
following conditions are equivalent.

(1) $M = M_1 \oplus M_2 \oplus \cdots \oplus M_r$.

(2) (i) $M = M_1 + M_2 + \cdots + M_r$, and
(ii) $M_i \cap M^i = \{0\}$ for all i, where $M^i$ denotes the submodule
$M_1 + M_2 + \cdots + M_{i-1} + M_{i+1} + M_{i+2} + \cdots + M_r$.


Proof 1 ⟹ 2. Assume 1. Since every element x of M has a unique representation
as $x = x_1 + x_2 + \cdots + x_r$, 2(i) is evident. Let $x \in M_i \cap M^i$. Then
$x = x_1 + x_2 + \cdots + x_{i-1} + x_{i+1} + x_{i+2} + \cdots + x_r$, where $x_j$, $j \neq i$, belongs to $M_j$. Thus,

$$0 = x_1 + x_2 + \cdots + x_{i-1} - x + x_{i+1} + x_{i+2} + \cdots + x_r = 0 + 0 + \cdots + 0.$$

From the uniqueness of the representation of 0, it follows that each $x_j$ is zero, and so
x is also 0.

2 ⟹ 1. Assume 2. Since every element of M is a sum of elements of $M_1, M_2, \ldots, M_r$,
it follows that every element x of M has a representation $x = x_1 + x_2 + \cdots + x_r$,
where $x_i \in M_i$ for all i. Now, we prove the uniqueness of the representation. Suppose
that

$$x_1 + x_2 + \cdots + x_r = y_1 + y_2 + \cdots + y_r,$$

where $x_i, y_i \in M_i$ for all i. Then $x_i - y_i = \sum_{j \neq i} (y_j - x_j)$ belongs to $M_i \cap M^i = \{0\}$.
This shows that $x_i = y_i$ for all i. ♯


Remark 6.1.28 If M is the direct sum of $M_1, M_2, \ldots, M_r$, then $M^i$ as defined in the
above proposition is the direct sum of $M_1, M_2, \ldots, M_{i-1}, M_{i+1}, M_{i+2}, \ldots, M_r$.
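The uniqueness condition in the definition of an internal direct sum can be tested directly in a small finite module. The following sketch assumes the illustrative example M = Z_6 over Z with submodules generated by 3 and by 2; the helper names `submodule` and `is_internal_direct_sum` are hypothetical.

```python
from itertools import product


def submodule(gen, n):
    """Cyclic submodule of the Z-module Z_n generated by gen, as a sorted list."""
    return sorted({(a * gen) % n for a in range(n)})


def is_internal_direct_sum(parts, n):
    """True if every element of Z_n is a sum x_1 + ... + x_r, x_i in parts[i], in exactly one way."""
    counts = [0] * n
    for combo in product(*parts):
        counts[sum(combo) % n] += 1
    return all(c == 1 for c in counts)


# Z_6 is the internal direct sum of {0, 3} and {0, 2, 4}.
M1, M2 = submodule(3, 6), submodule(2, 6)
```

By contrast, in Z_4 the submodule {0, 2} taken twice does not give a direct sum: its sums miss 1 and 3 and hit 0 and 2 twice.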


Quotient Modules 


Let M be a left R-module, and N a submodule of M. Then N is a subgroup of (M, +).
Consider the quotient group

$$M/N = \{x + N \mid x \in M\}.$$

Here the coset $x + N = \{x + n \mid n \in N\}$, and the addition is defined by

$$(x + N) + (y + N) = (x + y) + N.$$

Define the external multiplication $\cdot : R \times M/N \to M/N$ by

$$a \cdot (x + N) = (a \cdot x) + N.$$

Then M/N is a left R-module, and it is called the quotient module of M modulo N.


Remark 6.1.29 In general, a submodule of a finitely generated module need not be
finitely generated. For example, the polynomial ring $Z[X_1, X_2, \ldots]$ over Z in infinitely
many variables is a module over itself which is generated by the identity of the
ring, whereas the submodule generated by the set of all indeterminates is not finitely
generated (verify).


Module Homomorphisms, Isomorphisms 


Let $M_1$ and $M_2$ be left R-modules over a ring R. A map f from $M_1$ to $M_2$ is called
an R-module homomorphism if $f(ax + by) = af(x) + bf(y)$ for all $a, b \in R$ and
$x, y \in M_1$. A bijective homomorphism is called an isomorphism. The proofs of the
following results are imitations of the proofs of the corresponding results in vector
space theory, and are left as simple exercises.


Proposition 6.1.30 Let f be a homomorphism from a left R-module $M_1$ to a left
R-module $M_2$. Then,

(i) $f(0) = 0$,
(ii) $f(-x) = -f(x)$ for all $x \in M_1$,
(iii) the image of a submodule of $M_1$ under the map f is a submodule of $M_2$, and
(iv) the inverse image of a submodule of $M_2$ under the map f is a submodule of $M_1$. ♯


In particular, $f^{-1}(\{0\})$ is a submodule of $M_1$, called the kernel of the
homomorphism, and it is denoted by ker f.


Proposition 6.1.31 A homomorphism f is injective if and only if ker f = {0}. ♯


Theorem 6.1.32 (Fundamental theorem of homomorphism). Let f be a homomorphism
from a left R-module $M_1$ to a left R-module $M_2$. Let $N_1$ be a submodule of $M_1$,
and let $\nu$ denote the quotient map from $M_1$ to $M_1/N_1$.
Then there exists a homomorphism $\bar{f}$ from $M_1/N_1$ to $M_2$ such that $\bar{f} \circ \nu = f$ if and
only if $N_1 \subseteq$ ker f. Also, if such a homomorphism exists, then it is unique. Further,
then $\bar{f}$ is injective if and only if $N_1$ = ker f. Finally, $\bar{f}$ is an isomorphism if and only
if f is surjective, and $N_1$ = ker f. ♯


Theorem 6.1.33 (Noether isomorphism theorem). Let $N_1$ and $N_2$ be submodules of
a left R-module M. Then $(N_1 + N_2)/N_2$ is isomorphic to $N_1/(N_1 \cap N_2)$. ♯
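A quick numerical sanity check of the Noether isomorphism theorem is possible in a finite cyclic group. The sketch below assumes the illustrative choice M = Z_24 over Z, with $N_1$ generated by 4 and $N_2$ generated by 6. It only compares the orders of the two quotients; since both quotients are cyclic here, equal order does imply isomorphism, but this is a consistency check and not a proof. The helper names `span` and `quotient_order` are hypothetical.

```python
from math import gcd


def span(gens, n):
    """Submodule of the Z-module Z_n generated by gens (the multiples of a gcd)."""
    g = n
    for x in gens:
        g = gcd(g, x)
    return frozenset(range(0, n, g))


def quotient_order(big, small):
    """Order of the quotient big/small for finite submodules with small inside big."""
    assert small <= big
    return len(big) // len(small)


n = 24
N1, N2 = span([4], n), span([6], n)
total = span([4, 6], n)   # N1 + N2
inter = N1 & N2           # N1 intersected with N2

lhs = quotient_order(total, N2)   # order of (N1 + N2)/N2
rhs = quotient_order(N1, inter)   # order of N1/(N1 ∩ N2)
```

Both `lhs` and `rhs` come out equal, as the theorem predicts.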


Proposition 6.1.34 Let $M_1$ and $M_2$ be left R-modules and f a homomorphism from
$M_1$ to $M_2$. Then,

(i) f is surjective if and only if it takes a set of generators to a set of generators.
(ii) f is injective if and only if it takes an independent set to an independent set. ♯




6.2 Modules over P.I.D 


In this section, we shall be mainly interested in finitely generated modules. Let M
be a finitely generated R-module. We say that a finite subset $S = \{x_1, x_2, \ldots, x_n\}$ is
a basis of M if every element $x \in M$ can be written uniquely as

$$x = a_1 x_1 + a_2 x_2 + \cdots + a_n x_n.$$

This amounts to saying that S generates M, and S is linearly independent.

An R-module M is said to be a free module over R if it has a basis.

Thus, every vector space V over a field F (a module over a field) is a free
F-module. This is not true for modules over an arbitrary ring. For example, $Z_2$ is a
$Z_4$-module but not a free $Z_4$-module.
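That $Z_2$ has no basis as a $Z_4$-module can be confirmed by exhaustive search, using the definition of a basis given above (every element is written uniquely as a combination). The sketch below assumes that Z_k acts on Z_m by multiplication modulo m, which is well defined when m divides k; the function name `find_basis` is a hypothetical helper.

```python
from itertools import combinations, product


def find_basis(module_order, ring_order):
    """Search for a basis of Z_module_order as a module over Z_ring_order.

    Assumes module_order divides ring_order, so that the action a*x mod module_order
    is well defined. Returns a basis as a tuple, or None if there is none.
    """
    elements = range(module_order)
    for size in range(module_order + 1):
        for subset in combinations(elements, size):
            counts = [0] * module_order
            for coeffs in product(range(ring_order), repeat=size):
                value = sum(a * x for a, x in zip(coeffs, subset)) % module_order
                counts[value] += 1
            if all(c == 1 for c in counts):
                return subset  # every element has exactly one representation
    return None
```

For $Z_2$ over $Z_4$ the search fails because, for example, $0 = 0 \cdot \bar{1} = 2 \cdot \bar{1}$, so representations are never unique; for $Z_4$ over itself it finds the basis consisting of 1.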


Proposition 6.2.1 Let M be a free R-module with S as a basis. Then every map f
from S to an R-module N can be extended uniquely to a homomorphism from M to N.


Proof Let $S = \{x_1, x_2, \ldots, x_n\}$ be a basis of M, and f a map from S to N. Since
every element of M can be written uniquely as $a_1 x_1 + a_2 x_2 + \cdots + a_n x_n$, f can be
extended to a map $\bar{f}$ from M to N defined by

$$\bar{f}(a_1 x_1 + a_2 x_2 + \cdots + a_n x_n) = a_1 f(x_1) + a_2 f(x_2) + \cdots + a_n f(x_n).$$

Clearly, $\bar{f}$ is a homomorphism which extends f. It is also clear that the definition of
$\bar{f}$ is forced, and so it is unique. ♯


Corollary 6.2.2 An R-module M is free with a basis containing n elements if and
only if it is isomorphic to $R^n$.


Proof The R-module $R^n$ has $\{e_1, e_2, \ldots, e_n\}$ as a basis, where $e_i$ is the row with n columns
in which the i-th column is 1 and the rest of the columns are 0 (this basis is called the
standard basis). Thus, $R^n$ is a free R-module. Since an isomorphism takes a basis to
a basis, any isomorphic image of $R^n$ is a free R-module with a basis containing n
elements. Conversely, if M is a free R-module with a basis $\{x_1, x_2, \ldots, x_n\}$, then the
map which takes $x_i$ to $e_i$ extends to an isomorphism from M to $R^n$. ♯


Remark 6.2.3 Unlike in the case of vector spaces, for an arbitrary ring R, $R^n$ isomorphic
to $R^m$ does not imply that n = m. For example, if we take R to be the ring of
endomorphisms of an infinite dimensional vector space V, then $R^2$ is isomorphic to
R (verify). However, if R is a P.I.D., then $R^n$ isomorphic to $R^m$ implies that n = m.
The proof of this fact will follow soon.


Proposition 6.2.4 Let M be an R-module, and N a submodule such that M/N is free.
Then M is isomorphic to $N \oplus M/N$.


Proof Let $T = \{x_1 + N, x_2 + N, \ldots, x_r + N\}$ be a basis of M/N. Then since
$a_1 x_1 + a_2 x_2 + \cdots + a_r x_r = 0$ implies that $a_1 (x_1 + N) + a_2 (x_2 + N) + \cdots + a_r (x_r + N) = N$
(the zero of M/N), and since T is a basis of M/N, it follows that $a_i = 0$ for
all i. Thus, $S = \{x_1, x_2, \ldots, x_r\}$ is a linearly independent subset of M. Let L
be the submodule of M generated by S. Then L is free with S as a basis, and the
map $x_i \mapsto x_i + N$ extends to an isomorphism from L to M/N. Let $x \in M$. Then
$x + N = a_1 (x_1 + N) + a_2 (x_2 + N) + \cdots + a_r (x_r + N)$ for some $a_1, a_2, \ldots, a_r \in R$.
But, then $x - (a_1 x_1 + a_2 x_2 + \cdots + a_r x_r)$ belongs to N. Hence M = N + L. Next,
suppose that y + z = 0, where $y \in N$ and $z \in L$. Then $z + N = z + y + N = N$,
and since $z \mapsto z + N$ is an isomorphism from L to M/N, it follows that z = 0. In
turn, y = 0. This shows that $M = N \oplus L \approx N \oplus M/N$. ♯


Proposition 6.2.5 If L and N are free submodules of M such that $M = L \oplus N$,
then M is a free module.


Proof Suppose that $S_1$ is a basis of L, and $S_2$ is a basis of N. Then $S_1 \cup S_2$ is linearly
independent, and also generates M (verify). ♯


Recall that a commutative ring R is a principal ideal domain (P.I.D.) if it is an integral
domain, and every ideal of R is of the form Ra for some $a \in R$. Note that Ra = Rb
if and only if a and b differ by a unit in R. For further details, see Algebra 1.


Theorem 6.2.6 Let R be a principal ideal domain, and M a free R-module with
a finite basis containing n elements. Then every nonzero submodule of M is a free
module with a basis containing at most n elements.


Proof The proof is by induction on n. If n = 1, then $M = \langle x_1 \rangle = R x_1$,
where $x_1 \neq 0$, and $a x_1 = 0$ implies that a = 0. The map $a \mapsto a x_1$ is clearly an
R-isomorphism from R to M. Therefore, it is sufficient to show that every nonzero
submodule of R considered as a module over R is free. A nonzero submodule of R
is an ideal of R, and it is of the form Ra, where $a \neq 0$. Clearly, Ra is free with {a} as
a basis.

Assume that the result is true for n < m. Let M be a free module with
$S = \{x_1, x_2, \ldots, x_m\}$ as a basis of M. Then the submodule $\langle x_1 \rangle$ generated by $x_1$ is
free with $\{x_1\}$ as a basis. Consider the quotient module $L = M/\langle x_1 \rangle$, and the
quotient map $\nu$. Since S generates M, $\nu(S)$ generates L. Since $\nu(x_1)$ is the zero of
L, it follows that $S' = \{\nu(x_2), \nu(x_3), \ldots, \nu(x_m)\}$ also generates L. We show that $S'$
is linearly independent. Suppose that

$$a_2 \nu(x_2) + a_3 \nu(x_3) + \cdots + a_m \nu(x_m) = \langle x_1 \rangle.$$

Then

$$a_2 x_2 + a_3 x_3 + \cdots + a_m x_m + \langle x_1 \rangle = \langle x_1 \rangle,$$

or equivalently,

$$a_2 x_2 + a_3 x_3 + \cdots + a_m x_m = -a_1 x_1$$

for some $a_1 \in R$. But, then

$$a_1 x_1 + a_2 x_2 + \cdots + a_m x_m = 0.$$

Since S is linearly independent, each $a_i = 0$. This shows that L is free, and $S'$ is
a basis of L. By the induction hypothesis, every nonzero submodule of $L = M/\langle x_1 \rangle$
is free with a basis containing at most m − 1 elements. Now, let N be a nonzero
submodule of M. If $\nu(N)$ is the zero submodule of L, then N is a nonzero submodule of
$\langle x_1 \rangle$, and so from the previous case, it is free with a singleton basis. Suppose that
$\nu(N)$ is nonzero, and so it is free with a basis containing at most m − 1 elements.
The restriction $\nu|N$ of $\nu$ to N is a surjective homomorphism from N to $\nu(N)$ whose
kernel is $N \cap \langle x_1 \rangle$. Since $N/(N \cap \langle x_1 \rangle)$ is isomorphic to the submodule $\nu(N)$ of
$M/\langle x_1 \rangle$, it is free with a basis containing at most m − 1 elements. Further,
$N \cap \langle x_1 \rangle$ is either zero or free with a singleton basis. The result follows from
Propositions 6.2.4 and 6.2.5. ♯


Remark 6.2.7 In fact, every submodule of a free module over a P.I.D. is free, even
when the basis is infinite. The proof uses transfinite induction.


Z-modules are abelian groups, and free Z-modules are called free abelian groups. 
Since Z is a P.I.D., we have the following corollary. 


Corollary 6.2.8 Every subgroup of a finitely generated free abelian group is free. ♯


Proposition 6.2.9 Every finitely generated module over R is isomorphic to a quotient
of a free module.


Proof If M is generated by $\{x_1, x_2, \ldots, x_n\}$, then the map f from $R^n$ to M defined by
$f(a_1, a_2, \ldots, a_n) = a_1 x_1 + a_2 x_2 + \cdots + a_n x_n$ is a surjective homomorphism. Since
$R^n$ is free, the result follows from the fundamental theorem of homomorphism. ♯


Corollary 6.2.10 Every submodule of a finitely generated module over a P.I.D. is
finitely generated.


Proof Let $M = \langle \{x_1, x_2, \ldots, x_n\} \rangle$ be a finitely generated module. Then the map f
from $R^n$ to M defined by $f(a_1, a_2, \ldots, a_n) = a_1 x_1 + a_2 x_2 + \cdots + a_n x_n$ is a surjective
homomorphism from the free module $R^n$ to M. Let N be a submodule of M. Then
$f^{-1}(N)$ is a submodule of $R^n$. Since $R^n$ is a free module with a basis consisting of n
elements, it follows that $f^{-1}(N)$ is a free submodule of $R^n$ with a basis containing
at most n elements. Hence $N = f(f^{-1}(N))$ is also generated by at most n
elements. ♯


A module M is called a cyclic module if M is generated by a single element.
Thus, M is cyclic if there exists an element $x \in M$ such that $M = \langle x \rangle = Rx$.
The map f defined by f(a) = ax is a surjective homomorphism from R to M. Since
R is cyclic over R, and homomorphic images of cyclic modules are cyclic modules,
it follows that a module M is cyclic if and only if it is a homomorphic image of R.
Suppose that R is a P.I.D., and M = Rx a cyclic module. Let N be a submodule of
M. Then $f^{-1}(N)$ is a submodule of R, and so it is an ideal Ra for some $a \in R$. Thus,
$f^{-1}(N)$ is cyclic. Hence $N = f(f^{-1}(N))$ is cyclic. This shows that a submodule of a
cyclic module over a P.I.D. is cyclic. Is this assertion true if R is not a P.I.D.?

Let M be an R-module, where R is a P.I.D. Let $x \in M$. Consider the map f from R
to M given by $f(a) = a \cdot x$. Then f is clearly an R-homomorphism. The kernel of f
is an ideal of R. Since R is a principal ideal domain, ker f = Ra for some $a \in R$. If
a = 0, then the kernel of f is {0}, and in this case the submodule $\langle x \rangle$ generated
by x is isomorphic to R. In this case we say that x is of period 0. Thus, x is of period
0 if and only if ax = 0 implies that a = 0. Such an element is also called a
torsion-free element of M. If ker f = Ra is nonzero, then $a \neq 0$ and ax = 0. In this
case x is called a torsion element, and a, where ker f = Ra, is called a period of x.
If a and b are periods of a torsion element x, then Ra = Rb, and so $a \sim b$. Thus,
the period of a torsion element is unique up to associates. It is clear that a is a period of
x in M if and only if, for every $b \in R$, bx = 0 exactly when $a \mid b$. A period of an
element x is denoted by o(x).

Suppose that $M = \langle x \rangle$ is a cyclic module generated by x. If x is of period
0, then M is isomorphic to R, and if a period of x is $a \neq 0$, then M is isomorphic
to R/Ra. Thus, a cyclic R-module is isomorphic to R, or it is isomorphic to R/Ra for
some $a \neq 0$. In the case of abelian groups (Z-modules), period corresponds to the order of
the element.
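For Z-modules the period of an element is its order, which makes the characterization "bx = 0 exactly when a divides b" easy to test. The sketch below assumes the illustrative module Z_n over Z and uses the standard fact that the order of x in Z_n is n / gcd(x, n); the helper names `period` and `annihilates` are hypothetical.

```python
from math import gcd


def period(x, n):
    """Positive generator of the ideal {a in Z : a*x = 0 in Z_n}, i.e. the order of x."""
    return n // gcd(x, n)


def annihilates(b, x, n):
    """True if b*x = 0 in Z_n."""
    return (b * x) % n == 0


# In Z_12 the element 4 has period 3: b*4 = 0 exactly when 3 divides b.
a = period(4, 12)
consistent = all(annihilates(b, 4, 12) == (b % a == 0) for b in range(-30, 31))
```

Note that -3 is also a period of 4 in Z_12, in line with the statement that a period is unique only up to associates.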


Definition 6.2.11 A module M over a ring R is called a torsion module if every 
element of M is a torsion element. It is said to be torsion free, if every nonzero 
element of M is torsion free. A module which is neither torsion nor torsion free is 
called a mixed module. 


Every finite abelian group is a torsion Z-module. A torsion Z-module is also called
a periodic group. The additive group Z of integers is a torsion-free Z-module.


Proposition 6.2.12 Let M be an R-module, and let T(M) denote the set of all torsion
elements of M. Then T(M) is a torsion submodule of M, and M/T(M) is a torsion-free
module.


Proof Suppose that $x, y \in T(M)$, and $a, b \in R$. Since x, y are torsion elements, there
exist nonzero elements c and d such that cx = 0 = dy. Since R is an integral domain,
$cd \neq 0$, and cd(ax + by) = 0. This shows that $ax + by \in T(M)$. Thus, T(M) is a submodule of
M. Next, let x + T(M) be a nonzero element of M/T(M). Then $x + T(M) \neq T(M)$.
This means that x is not a torsion element of M. Suppose that a(x + T(M)) = T(M).
Then $ax \in T(M)$. Hence there exists $b \neq 0$ such that bax = 0. Since x is torsion
free, ba = 0. Again, since $b \neq 0$, a = 0. This shows that M/T(M) is torsion free. ♯




Definition 6.2.13 T(M) is called the torsion part of M, and M/T(M) is called the 
torsion free part of M. 


If M is finitely generated over a P.I.D., then so are T(M) and M/T(M).


Theorem 6.2.14 Every finitely generated torsion-free module over a P.I.D. is free.


Proof Let $S = \{x_1, x_2, \ldots, x_n\}$ be a set of generators of M. We may assume without
any loss that each $x_i \neq 0$. Since M is torsion free, $\{x_i\}$ is linearly independent. Let T
be a maximal linearly independent subset of S. Without any loss, we may suppose
that $T = \{x_1, x_2, \ldots, x_r\}$, $r \leq n$. Let N be the submodule of M generated by T. Then
N is free. Since T is maximal linearly independent, $\{x_1, x_2, \ldots, x_r, x_{r+i}\}$ is linearly
dependent for all i, $1 \leq i \leq n - r$. Hence, there are $a_1, a_2, \ldots, a_r, a_{r+i}$ in R, not all
0, such that

$$a_1 x_1 + a_2 x_2 + \cdots + a_r x_r + a_{r+i} x_{r+i} = 0.$$

Since T is linearly independent, $a_{r+i} \neq 0$, for otherwise each $a_j = 0$. Also
$a_{r+i} x_{r+i} = -a_1 x_1 - a_2 x_2 - \cdots - a_r x_r$ belongs to N. Let $a = a_{r+1} a_{r+2} \cdots a_n$. Then $a \neq 0$,
and $a x_i \in N$ for all i. Since S generates M, $ax \in N$ for all $x \in M$. Define a map f
from M to N by f(x) = ax. Clearly, this is a module homomorphism. Since M is
torsion free, and $a \neq 0$, ax = 0 implies that x = 0. This means that f is injective.
Thus, M is isomorphic to a submodule of N. Since N is free with a finite basis, and
a submodule of a free module with a finite basis is free (Theorem 6.2.6), it follows that
M is free. ♯


Corollary 6.2.15 If M is a finitely generated module over a P.I.D., then
$M \approx T(M) \oplus M/T(M)$.


Proof Since M is finitely generated, M/T(M) is finitely generated and torsion free.
From the above theorem, M/T(M) is free. From Proposition 6.2.4, it follows that
$M \approx T(M) \oplus M/T(M)$. ♯


Since every finitely generated free module over R is isomorphic to R” for some 
n, we have the following corollary. 


Corollary 6.2.16 Every finitely generated module over a P.I.D. is isomorphic to the
direct sum of a finitely generated torsion module and $R^n$ for some n. ♯


Since every finitely generated torsion abelian group is finite, we have the following:


Corollary 6.2.17 Every finitely generated abelian group is isomorphic to the direct
sum of a finite abelian group with $Z^n$ for some n. ♯


Thus, to study the structure of finitely generated modules over principal ideal 
domains, it is sufficient to study the structure of finitely generated torsion modules 
over principal ideal domains. 




Proposition 6.2.18 Let M be a finitely generated torsion module. Then there exists
$a \neq 0$ such that ax = 0 for all $x \in M$.


Proof Suppose that $M = \langle \{x_1, x_2, \ldots, x_r\} \rangle$. Let $a_i$ be a period of $x_i$. Then
$a_i x_i = 0$. Let $a = a_1 a_2 \cdots a_r$. Then $a \neq 0$, and ax = 0 for all $x \in M$. ♯


Let $A = \{a \in R \mid ax = 0 \text{ for all } x \in M\}$. Then A is an ideal of R, and since R is
a P.I.D., A = Rm for some $m \in R$. Such an m is called an exponent of M. It is clear
that an exponent of M is unique up to associates.

Let M be a torsion module over R, and p a prime element of R. We say that M is
a p-module if given any element $x \in M$, there exists $n \in N$ such that $p^n \cdot x = 0$.

Let M be a torsion module and p a prime of R. Let $M_p = \{x \in M \mid p^n x = 0 \text{ for some } n \in N\}$.
Then $M_p$ is a submodule of M, and it is called the p-part of M.


Theorem 6.2.19 Let M be a torsion module, and a an exponent of M. Let
$\{p_1, p_2, \ldots, p_n\}$ be a set of primes dividing a such that $p_i$ and $p_j$ are not associates for
$i \neq j$, and also each prime divisor of a is an associate of $p_i$ for some i. Then

$$M = M_{p_1} \oplus M_{p_2} \oplus \cdots \oplus M_{p_n}.$$

Proof Let $x \in M$. Since a is an exponent of M, the period of x divides a. We may suppose
that

$$o(x) = p_1^{t_1} p_2^{t_2} \cdots p_n^{t_n}.$$

Let $q_i = o(x)/p_i^{t_i}$. Then $(q_1, q_2, \ldots, q_n) \sim 1$. Since R is a P.I.D., there exist $u_1, u_2, \ldots, u_n$
in R such that

$$u_1 q_1 + u_2 q_2 + \cdots + u_n q_n = 1.$$

Hence

$$x = u_1 q_1 x + u_2 q_2 x + \cdots + u_n q_n x.$$

Now, $p_i^{t_i} u_i q_i x = u_i p_i^{t_i} q_i x = u_i o(x) x = 0$. This shows that $u_i q_i x \in M_{p_i}$. Hence

$$M = M_{p_1} + M_{p_2} + \cdots + M_{p_n}.$$

Further, suppose that

$$x_1 + x_2 + \cdots + x_n = 0,$$

where $x_i \in M_{p_i}$. Suppose that $o(x_i) \sim p_i^{t_i}$. Let $q_i = p_1^{t_1} p_2^{t_2} \cdots p_{i-1}^{t_{i-1}} p_{i+1}^{t_{i+1}} \cdots p_n^{t_n}$. Then
$0 = q_i (x_1 + x_2 + \cdots + x_n) = q_i x_i$. Since $(p_i^{t_i}, q_i) \sim 1$, there exist $u_i, v_i$ such
that $u_i p_i^{t_i} + v_i q_i = 1$. Hence $x_i = u_i p_i^{t_i} x_i + v_i q_i x_i = 0$. This shows that the
representation of an element x as a sum of elements of the $M_{p_i}$ is unique. Hence M is the
direct sum $M_{p_1} \oplus M_{p_2} \oplus \cdots \oplus M_{p_n}$. ♯
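The first half of the proof of Theorem 6.2.19 is constructive: Bezout coefficients for the cofactors split an element into its p-parts. The following sketch carries this out for the Z-module Z_n. As a simplifying assumption it works with the exponent n of the whole module in place of the period o(x) of the individual element, which yields the same components because the decomposition is unique. All function names are hypothetical helpers.

```python
def ext_gcd(a, b):
    """Return (g, s, t) with s*a + t*b == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, s, t = ext_gcd(b, a % b)
    return g, t, s - (a // b) * t


def bezout(qs):
    """Return (g, us) with sum(u*q for u, q in zip(us, qs)) == g == gcd of qs."""
    g, us = qs[0], [1]
    for q in qs[1:]:
        g, s, t = ext_gcd(g, q)
        us = [s * u for u in us] + [t]
    return g, us


def prime_powers(n):
    """Map each prime p dividing n to the full power of p dividing n (n >= 2)."""
    out, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            out[p] = out.get(p, 1) * p
            n //= p
        p += 1
    if n > 1:
        out[n] = n
    return out


def primary_components(x, n):
    """Split x in Z_n (n >= 2) as a sum of elements x_p lying in the p-parts of Z_n."""
    pp = prime_powers(n)
    primes = sorted(pp)
    qs = [n // pp[p] for p in primes]   # cofactors q_i = n / p_i^{t_i}
    g, us = bezout(qs)                  # u_1 q_1 + ... + u_k q_k = 1
    assert g == 1
    return {p: (u * q * x) % n for p, u, q in zip(primes, us, qs)}
```

For example, `primary_components(7, 12)` splits 7 as 3 + 4, where 3 is killed by 4 and 4 is killed by 3.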


Now, we describe the structure of finitely generated p-modules, where p is a prime.
First observe that if M is a torsion module generated by $\{x_1, x_2, \ldots, x_n\}$, then the
exponent of M is the l.c.m. of $o(x_1), o(x_2), \ldots, o(x_n)$. Thus, if M is a p-module generated
by $\{x_1, x_2, \ldots, x_n\}$, where $o(x_i) \sim p^{n_i}$, and m is the maximum of the $n_i$, then $p^m$ will be
an exponent of M.


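Theorem 6.2.20 Let M be a finitely generated p-module over a P.I.D. (p a prime).
Then M is a direct sum

$$\langle x_1 \rangle \oplus \langle x_2 \rangle \oplus \cdots \oplus \langle x_m \rangle$$

of cyclic modules, where $o(x_i) \sim p^{n_i}$, $n_1 \geq n_2 \geq \cdots \geq n_m$.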


Proof (The proof is an imitation of the proof of Theorem 7.3.1 of Algebra 1.)
Let M be a p-module generated by $\{x_1, x_2, \ldots, x_m\}$, where $x_i \neq 0$ for all i. The
proof is by induction on m. If m = 1, then $M = \langle x_1 \rangle$, and then there is
nothing to do. Assume that the result is true for m = r. We prove it for r + 1. Let
$S = \{x_1, x_2, \ldots, x_{r+1}\}$ be a set of generators for M, where $x_i \neq 0$ for all i. Suppose
that $o(x_i) \sim p^{n_i}$. We may assume that $n_1 \geq n_2 \geq \cdots \geq n_{r+1}$. Thus, $p^{n_1}$ is an exponent
of M. Consider the quotient module $M/\langle x_1 \rangle$, and the quotient map $\nu$ from M to
$M/\langle x_1 \rangle$. Then $M/\langle x_1 \rangle$ is generated by $\{\nu(x_2), \nu(x_3), \ldots, \nu(x_{r+1})\}$. Clearly,
$M/\langle x_1 \rangle$ is a p-module of exponent $p^t$, where $t \leq n_1$. By the induction hypothesis,

$$M/\langle x_1 \rangle = \langle \nu(y_2) \rangle \oplus \langle \nu(y_3) \rangle \oplus \cdots \oplus \langle \nu(y_s) \rangle$$

for some $y_2, y_3, \ldots, y_s$ in M such that $o(\nu(y_i)) = p^{n_i}$, $n_1 \geq n_2 \geq n_3 \geq \cdots \geq n_s$
(here, for $i \geq 2$, $n_i$ now denotes the exponent attached to $\nu(y_i)$).
We show that there exists $z_i \in M$ for all $i \geq 2$ such that $\nu(z_i) = \nu(y_i)$, and
$o(z_i) = o(\nu(y_i)) = o(\nu(z_i))$. Since $p^t y_i = 0$ implies that $p^t (\nu(y_i)) = \langle x_1 \rangle$
(the zero of $M/\langle x_1 \rangle$), it follows that $o(\nu(y_i))$ divides $o(y_i)$. Since $o(\nu(y_i)) = p^{n_i}$,
$p^{n_i} \nu(y_i) = \langle x_1 \rangle$. This means that $p^{n_i} y_i \in \langle x_1 \rangle$. Suppose that $p^{n_i} y_i = p^{t_i} a_i x_1$,
where $(p, a_i) \sim 1$, and $t_i \leq n_1$. If $t_i = n_1$, then $p^{n_i} y_i = 0$, and $o(y_i)$ divides $p^{n_i}$.
Hence $o(y_i) \sim p^{n_i}$, and then there is nothing to do. Suppose that $t_i < n_1$. Since
$(a_i, p^{n_1}) \sim 1$, there exist $u, v \in R$ such that $u a_i + v p^{n_1} = 1$. But, then $x_1 = u a_i x_1$.
Hence $o(a_i x_1) = o(x_1) = p^{n_1}$. Thus, $o(p^{n_i} y_i) = o(p^{t_i} a_i x_1) = p^{n_1 - t_i}$. This shows that
$o(y_i) = p^{n_1 - t_i + n_i}$. Since $p^{n_1}$ is an exponent of M, it follows that $n_1 - t_i + n_i \leq n_1$. Hence
$n_i \leq t_i$. Take $z_i = y_i - p^{t_i - n_i} a_i x_1$. Then $\nu(z_i) = \nu(y_i)$, and $o(z_i) = p^{n_i} = o(\nu(z_i))$.
Now, we show that

$$M = \langle x_1 \rangle \oplus \langle z_2 \rangle \oplus \langle z_3 \rangle \oplus \cdots \oplus \langle z_s \rangle.$$

Let $x \in M$. Since $\{\nu(z_2), \nu(z_3), \ldots, \nu(z_s)\}$ generates $M/\langle x_1 \rangle$, it follows that
$\nu(x) = a_2 \nu(z_2) + a_3 \nu(z_3) + \cdots + a_s \nu(z_s)$ for some $a_2, a_3, \ldots, a_s$ in R. Hence

$$x - a_2 z_2 - a_3 z_3 - \cdots - a_s z_s = a_1 x_1$$

for some $a_1 \in R$. Thus,

$$x = a_1 x_1 + a_2 z_2 + a_3 z_3 + \cdots + a_s z_s.$$

Next, suppose that

$$a_1 x_1 + a_2 z_2 + a_3 z_3 + \cdots + a_s z_s = 0.$$

Then

$$a_2 \nu(z_2) + a_3 \nu(z_3) + \cdots + a_s \nu(z_s) = \langle x_1 \rangle.$$

Since

$$M/\langle x_1 \rangle = \langle \nu(z_2) \rangle \oplus \langle \nu(z_3) \rangle \oplus \cdots \oplus \langle \nu(z_s) \rangle,$$

$a_i \nu(z_i) = \langle x_1 \rangle$ for all $i \geq 2$. But, then $o(z_i) = o(\nu(z_i))$ divides $a_i$ for all $i \geq 2$.
Hence $a_i z_i = 0$ for all $i \geq 2$. In turn, $a_1 x_1$ is also 0. Thus, every element x can be
written uniquely as

$$x = w_1 + w_2 + \cdots + w_s,$$

where $w_1 \in \langle x_1 \rangle$, $w_i \in \langle z_i \rangle$ for $i \geq 2$. ♯
Combining the above results, we obtain the following: 


Corollary 6.2.21 Every finitely generated module M over a P.I.D. is isomorphic to a
direct sum of finitely many cyclic modules, some of them isomorphic to R, and some
of them isomorphic to $R/Rp^n$ for different primes p and for different $n \in N$. ♯


Proposition 6.2.22 Let R be an integral domain. Then $R^n$ is isomorphic to $R^m$ as R-modules if and only if $n = m$.


Proof Let $\{e_1, e_2, \ldots, e_m\}$ be the standard basis of $R^m$. Let $f$ be an isomorphism from $R^m$ to $R^n$. Let $F$ be the field of quotients of $R$. Then $f$ can be extended to a vector space homomorphism $\bar{f}$ from $F^m$ to $F^n$ by

$$\bar{f}(a_1e_1 + a_2e_2 + \cdots + a_me_m) = a_1f(e_1) + a_2f(e_2) + \cdots + a_mf(e_m).$$

It is clear that $\bar{f}$ is injective. Hence $m \le n$. Similarly, considering $f^{-1}$, we see that $n \le m$. ♯


The proof of the following proposition is straightforward verification. 




Proposition 6.2.23 Let R be a principal ideal domain. Then two R-modules $M$ and $M'$ are isomorphic if and only if $T(M)$ is isomorphic to $T(M')$, and $M/T(M)$ is isomorphic to $M'/T(M')$. ♯


It follows from Proposition 6.2.22 that there is a unique $n \in \mathbb{N}$ such that $M/T(M)$ is isomorphic to $R^n$. This $n$ is the rank of $M$. The following proposition is also easy to observe.


Proposition 6.2.24 A finitely generated torsion module $M$ over a P.I.D. is isomorphic to $M'$ if and only if $M_p$ is isomorphic to $M'_p$ for all primes $p$. ♯


The proof of the following proposition is also an imitation of the proof of Theorem 
7.3.3 of Algebra 1. 


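Proposition 6.2.25 Let $M$ and $M'$ be finitely generated p-modules. Suppose that

$$M = \langle x_1\rangle \oplus \langle x_2\rangle \oplus \cdots \oplus \langle x_m\rangle,$$

where $o(x_i) \sim p^{r_i}$, $r_1 \ge r_2 \ge \cdots \ge r_m$, and

$$M' = \langle y_1\rangle \oplus \langle y_2\rangle \oplus \cdots \oplus \langle y_n\rangle,$$

where $o(y_j) \sim p^{s_j}$, $s_1 \ge s_2 \ge \cdots \ge s_n$. Then $M$ is isomorphic to $M'$ if and only if $m = n$ and $r_i = s_i$ for all $i$.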


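Proof Suppose that $m = n$ and $r_i = s_i$ for all $i$. Then $\langle x_i\rangle \approx R/Rp^{r_i} \approx \langle y_i\rangle$ for all $i$. Further, if $P \approx P'$ and $Q \approx Q'$, then $P \oplus Q \approx P' \oplus Q'$. Thus, $\langle x_1\rangle \approx \langle y_1\rangle$ and $\langle x_1\rangle \oplus \langle x_2\rangle \approx \langle y_1\rangle \oplus \langle y_2\rangle$. Proceeding inductively, we find that $M$ is isomorphic to $M'$. The proof of the converse is by induction on $\max(m, n)$. If $\max(m, n) = 1$, then $M = \langle x_1\rangle$ is cyclic of exponent $p^{r_1}$, and $M' = \langle y_1\rangle$ is cyclic of exponent $p^{s_1}$. Since isomorphic modules have the same exponent, it follows that $r_1 = s_1$. Assume that the result is true for $\max(m, n) = m$, $n \le m$. Let

$$M = \langle x_1\rangle \oplus \langle x_2\rangle \oplus \cdots \oplus \langle x_{m+1}\rangle,$$

where $o(x_i) \sim p^{r_i}$, $r_1 \ge r_2 \ge \cdots \ge r_{m+1}$, and

$$M' = \langle y_1\rangle \oplus \langle y_2\rangle \oplus \cdots \oplus \langle y_k\rangle,$$

where $k \le m+1$, $o(y_j) \sim p^{s_j}$, $s_1 \ge s_2 \ge \cdots \ge s_k$, be isomorphic modules. Let $\sigma$ be an isomorphism from $M$ to $M'$. Clearly, the exponent of $M$ is $p^{r_1}$, and the exponent of $M'$ is $p^{s_1}$. Since isomorphic modules have the same exponent, it follows that $r_1 = s_1$.

Since $\sigma$ is an isomorphism, $p^{r_1} = o(x_1) = o(\sigma(x_1)) = o(y_1)$. Suppose that

$$\sigma(x_1) = \beta_1y_1 + \beta_2y_2 + \cdots + \beta_ky_k. \qquad (6.1)$$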




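Since $o(\sigma(x_1)) = p^{r_1}$, $o(\beta_jy_j) = p^{r_1}$ for some $j$. After rearranging, we may assume that $o(\beta_1y_1) = p^{r_1}$. We show that

$$M' = \langle \sigma(x_1)\rangle \oplus \langle y_2\rangle \oplus \cdots \oplus \langle y_k\rangle.$$

Since $p^{r_1}$ is an exponent of $M'$, and $o(\beta_1y_1)$ divides $o(y_1)$, it follows that $o(y_1) \sim o(\beta_1y_1) \sim p^{r_1}$. Hence $(\beta_1, p^{r_1}) \sim 1$. Since $R$ is a principal ideal domain, there exist $u, v \in R$ such that

$$u\beta_1 + vp^{r_1} = 1.$$

Hence

$$y_1 = u\beta_1y_1 = u(\sigma(x_1) - \beta_2y_2 - \beta_3y_3 - \cdots - \beta_ky_k).$$

Since $\{y_1, y_2, \ldots, y_k\}$ generates $M'$, $\{\sigma(x_1), y_2, \ldots, y_k\}$ also generates $M'$. Next, suppose that

$$\delta_1\sigma(x_1) + \delta_2y_2 + \cdots + \delta_ky_k = 0.$$

Substituting the value of $\sigma(x_1)$ from (6.1), we find that

$$\delta_1\beta_1y_1 + (\delta_1\beta_2 + \delta_2)y_2 + \cdots + (\delta_1\beta_k + \delta_k)y_k = 0.$$

Since

$$M' = \langle y_1\rangle \oplus \langle y_2\rangle \oplus \cdots \oplus \langle y_k\rangle,$$

$$\delta_1\beta_1y_1 = (\delta_1\beta_2 + \delta_2)y_2 = \cdots = (\delta_1\beta_k + \delta_k)y_k = 0.$$

Since $o(\beta_1y_1) = p^{r_1}$, $p^{r_1}$ divides $\delta_1$. Hence $\delta_1\beta_jy_j = 0$ for all $j \ge 2$. In turn, $\delta_jy_j = 0$ for all $j \ge 2$. Consequently, $\delta_1\sigma(x_1)$ is also 0. This shows that

$$M' = \langle \sigma(x_1)\rangle \oplus \langle y_2\rangle \oplus \cdots \oplus \langle y_k\rangle.$$

Since $\sigma$ is an isomorphism from $M$ to $M'$ such that $\sigma(\langle x_1\rangle) = \langle \sigma(x_1)\rangle$, it induces an isomorphism from $M/\langle x_1\rangle$ to $M'/\langle \sigma(x_1)\rangle$. Clearly,

$$M/\langle x_1\rangle \approx \langle x_2\rangle \oplus \langle x_3\rangle \oplus \cdots \oplus \langle x_{m+1}\rangle,$$

and

$$M'/\langle \sigma(x_1)\rangle \approx \langle y_2\rangle \oplus \langle y_3\rangle \oplus \cdots \oplus \langle y_k\rangle.$$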




By the induction assumption, $m + 1 = k$ and $r_i = s_i$ for all $i$. ♯


If $\{m_1, m_2, \ldots, m_r\}$ is a set of pairwise co-prime members of $R$ and $m = m_1m_2\cdots m_r$, then it follows from the Chinese remainder theorem that $R/Rm$ is isomorphic to $R/Rm_1 \oplus R/Rm_2 \oplus \cdots \oplus R/Rm_r$. Using this fact and the above results, we obtain the following theorems.


Theorem 6.2.26 Let $M$ be a finitely generated torsion module over $R$, where $R$ is a P.I.D. Then there exists an ordered set $\{a_1, a_2, \ldots, a_t\}$ of nonzero nonunit elements of $R$ such that $a_i$ divides $a_{i+1}$ for all $i$, and $M$ is isomorphic to

$$R/Ra_1 \oplus R/Ra_2 \oplus \cdots \oplus R/Ra_t.$$

Further, such an ordered set $\{a_1, a_2, \ldots, a_t\}$ is unique in the sense that if $M$ is also isomorphic to

$$R/Rb_1 \oplus R/Rb_2 \oplus \cdots \oplus R/Rb_s,$$

where the $b_j$ are nonzero nonunits and $b_j$ divides $b_{j+1}$ for all $j$, then $t = s$ and $a_i$ is an associate of $b_i$ for all $i$. ♯


Theorem 6.2.27 Let $M$ be a finitely generated module over a principal ideal domain $R$. Then there exists a nonnegative integer $n$ together with an ordered set $\{a_1, a_2, \ldots, a_t\}$ of nonzero nonunit elements of $R$ such that $a_i$ divides $a_{i+1}$ for all $i$, and $M$ is isomorphic to

$$R^n \oplus R/Ra_1 \oplus R/Ra_2 \oplus \cdots \oplus R/Ra_t.$$

Further, $n$ and the ordered set $\{a_1, a_2, \ldots, a_t\}$ are unique in the sense that if $M$ is also isomorphic to

$$R^m \oplus R/Rb_1 \oplus R/Rb_2 \oplus \cdots \oplus R/Rb_s,$$

where the $b_j$ are nonzero nonunits and $b_j$ divides $b_{j+1}$ for all $j$, then $m = n$, $t = s$, and $a_i$ is an associate of $b_i$ for all $i$. ♯
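
As a small illustration of how the Chinese remainder theorem regroups a primary decomposition into the divisibility chain of Theorem 6.2.26, the following Python sketch treats only the special case $R = \mathbb{Z}$ (finite abelian groups). The function name `invariant_factors` and the input convention, a list of pairs `(p, k)` each standing for a summand $\mathbb{Z}/p^k\mathbb{Z}$, are assumptions made for this illustration and are not taken from the text.

```python
from collections import defaultdict


def invariant_factors(elementary_divisors):
    """Regroup prime-power summands Z/p^k into a chain a_1 | a_2 | ... | a_t.

    `elementary_divisors` is a list of (p, k) pairs with p prime and k >= 1
    (a hypothetical input convention chosen for this sketch).
    """
    by_prime = defaultdict(list)
    for p, k in elementary_divisors:
        by_prime[p].append(k)
    if not by_prime:
        return []
    # t is the largest number of summands belonging to a single prime.
    t = max(len(ks) for ks in by_prime.values())
    factors = [1] * t
    for p, ks in by_prime.items():
        ks = sorted(ks)
        # Right-align: the largest power of p goes into the last factor a_t,
        # the next largest into a_{t-1}, and so on.
        offset = t - len(ks)
        for i, k in enumerate(ks):
            factors[offset + i] *= p ** k
    return factors


# Z/2 + Z/8 + Z/3 is isomorphic to Z/2 + Z/24.
print(invariant_factors([(2, 1), (2, 3), (3, 1)]))
```

Each factor is a product of pairwise co-prime prime powers, so the Chinese remainder theorem identifies $\mathbb{Z}/a_i\mathbb{Z}$ with the direct sum of the corresponding primary cyclic summands; the right alignment is what forces $a_i$ to divide $a_{i+1}$.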


Exercises 


6.2.1 Describe all torsion abelian groups of exponent 24 which are generated by at 
the most three elements. 


6.2.2 Describe all torsion R[X]- modules which are of exponents (X — 1)?(X?+1)*, 
and which are generated by at the most two elements. 


6.2.3 Describe the torsion modules of exponent (X*+ 1)? which cannot be generated 
by less than three elements. 




6.3 Rational and Jordan Forms 


Let $V$ be a finite-dimensional vector space over a field $F$, and let $T$ be a linear transformation on $V$. Then $V$ becomes an $F[X]$-module with respect to the external operation $\cdot$ defined by $f(X)\cdot v = f(T)(v)$ (verify that it is indeed an $F[X]$-module). This module will be referred to as the $F[X]$-module associated to the linear transformation $T$. By the Cayley–Hamilton theorem, $\phi_T(T) = 0$, where $\phi_T(X)$ is the characteristic polynomial of $T$. Hence $\phi_T(X)\cdot v = \phi_T(T)(v) = 0$ for all $v \in V$. Thus, $V$ (being finite dimensional) is a finitely generated $F[X]$-module which is a torsion module. Since $F[X]$ is a P.I.D., we can use the structure theory of finitely generated torsion modules over a P.I.D. developed in the previous section to study the linear transformation $T$ by looking at its matrix representation with respect to suitable bases. Let $m_T(X)$ be an exponent of the module $V$. Then $m_T(T) = 0$, and whenever $f(T) = 0$, $m_T(X)$ divides $f(X)$. In particular, $m_T(X)$ divides $\phi_T(X)$. If $m_T(X)$ is assumed to be monic (leading coefficient 1), then $m_T(X)$ is unique, and it is called the minimum polynomial of $T$.
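
The characterization of $m_T(X)$ as the monic polynomial of least degree annihilating $T$ can be turned into a direct computation: look for the first power of the matrix that is a linear combination of the lower powers. The following Python sketch does this with exact rational arithmetic. The helper names (`mat_mul`, `flatten`, `solve_combination`, `minimal_polynomial`) are hypothetical and chosen only for this illustration; it is a sketch under the assumption that the entries are integers or rationals, not an efficient algorithm.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def flatten(M):
    """View an n x n matrix as a vector of length n*n."""
    return [x for row in M for x in row]


def solve_combination(vectors, target):
    """Return c with sum(c[i] * vectors[i]) == target, or None if impossible."""
    k = len(vectors)
    # One row per coordinate, one column per vector, plus the right-hand side.
    rows = [[Fraction(v[r]) for v in vectors] + [Fraction(target[r])]
            for r in range(len(target))]
    pivots, rank = [], 0
    for col in range(k):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        lead = rows[rank][col]
        rows[rank] = [x / lead for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rank])]
        pivots.append(col)
        rank += 1
    # A row 0 = nonzero means the target is not in the span.
    if any(all(x == 0 for x in row[:-1]) and row[-1] != 0 for row in rows):
        return None
    coeffs = [Fraction(0)] * k
    for i, col in enumerate(pivots):
        coeffs[col] = rows[i][-1]
    return coeffs


def minimal_polynomial(A):
    """Coefficients [c_0, ..., c_{d-1}, 1] of the monic minimum polynomial."""
    n = len(A)
    powers = [[[Fraction(int(i == j)) for j in range(n)] for i in range(n)]]
    while True:  # terminates by the Cayley-Hamilton theorem (degree <= n)
        nxt = mat_mul(powers[-1], A)
        c = solve_combination([flatten(P) for P in powers], flatten(nxt))
        if c is not None:
            # A^d = sum c_i A^i, so X^d - sum c_i X^i annihilates A.
            return [-x for x in c] + [Fraction(1)]
        powers.append(nxt)


# (X - 1)^2 (X - 2) = X^3 - 4X^2 + 5X - 2
print(minimal_polynomial([[1, 2, 2], [0, 1, 1], [0, 0, 2]]))
```

Coefficients are listed from the constant term upward, so the printed list corresponds to $-2 + 5X - 4X^2 + X^3$.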


Proposition 6.3.1 Let $T_1$ and $T_2$ be linear transformations on $V$. Then $T_1$ is similar to $T_2$ if and only if the $F[X]$-module $V$ associated to $T_1$ is isomorphic to the $F[X]$-module $V$ associated to $T_2$.


Proof Suppose that $T_2 = PT_1P^{-1}$, where $P$ is a nonsingular linear transformation on $V$. Then, given any polynomial $f(X) \in F[X]$, we have $f(T_2) = Pf(T_1)P^{-1}$, that is, $P(f(T_1)(v)) = f(T_2)(P(v))$ for all $v \in V$. It follows that $P$ is in fact a module isomorphism from the module $V$ associated to $T_1$ to the module $V$ associated to $T_2$. Conversely, suppose that $P$ is an isomorphism from the $F[X]$-module $V$ associated to $T_1$ to the $F[X]$-module $V$ associated to $T_2$. Then $P$ is clearly a nonsingular linear transformation on $V$, and $P(T_1(v)) = P(X\cdot v) = X\cdot P(v) = T_2(P(v))$ for all $v \in V$. Hence $T_2 = PT_1P^{-1}$. ♯


Let $T$ be a linear transformation on a finite-dimensional vector space $V$ over a field $F$ such that the associated $F[X]$-module $V$ is a cyclic module generated by $v \in V$. Then it is clear that the period $o(v)$ of $v$ is precisely the minimum polynomial $m_T(X)$ of $T$. Thus, we have an $F[X]$-module isomorphism $\eta$ from $F[X]/F[X]m_T(X)$ to the $F[X]$-module $V$ given by $\eta(f(X) + F[X]m_T(X)) = f(X)\cdot v = f(T)(v)$. Suppose that

$$m_T(X) = a_0 + a_1X + a_2X^2 + \cdots + a_{n-1}X^{n-1} + X^n.$$

Every element of $F[X]/F[X]m_T(X)$ is uniquely expressible as $r(X) + F[X]m_T(X)$, where $r(X)$ is a polynomial of degree at most $n - 1$. Thus, $\{1 + F[X]m_T(X), X + F[X]m_T(X), \ldots, X^{n-1} + F[X]m_T(X)\}$ is a basis of the vector space $F[X]/F[X]m_T(X)$ over $F$. Since $\eta$ (being an $F[X]$-module isomorphism) is an $F$-isomorphism, and $\eta(X^i + F[X]m_T(X)) = T^i(v)$, it follows that $\{v, T(v), T^2(v), \ldots, T^{n-1}(v)\}$ is a basis of $V$. Since $m_T(T) = 0$, we have

$$T^n(v) = -a_0v - a_1T(v) - a_2T^2(v) - \cdots - a_{n-1}T^{n-1}(v).$$


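Hence the matrix representation of $T$ with respect to this ordered basis is clearly

$$\begin{pmatrix} 0 & 0 & 0 & \cdots & 0 & -a_0\\ 1 & 0 & 0 & \cdots & 0 & -a_1\\ 0 & 1 & 0 & \cdots & 0 & -a_2\\ \vdots & \vdots & \vdots & & \vdots & \vdots\\ 0 & 0 & 0 & \cdots & 0 & -a_{n-2}\\ 0 & 0 & 0 & \cdots & 1 & -a_{n-1} \end{pmatrix}.$$

The above matrix is termed the companion matrix of the polynomial $a_0 + a_1X + a_2X^2 + \cdots + a_{n-1}X^{n-1} + X^n$.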
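
To make the shape of the companion matrix concrete, the following Python sketch builds it from the coefficients $a_0, \ldots, a_{n-1}$ and checks, for one sample polynomial, that the matrix is annihilated by that polynomial. The function names `companion` and `evaluate_monic` are hypothetical names introduced for this illustration.

```python
def companion(coeffs):
    """Companion matrix of X^n + a_{n-1} X^{n-1} + ... + a_0.

    `coeffs` is the list [a_0, a_1, ..., a_{n-1}].
    """
    n = len(coeffs)
    C = [[0] * n for _ in range(n)]
    for i in range(1, n):
        C[i][i - 1] = 1          # ones on the subdiagonal
    for i in range(n):
        C[i][n - 1] = -coeffs[i]  # last column holds -a_0, ..., -a_{n-1}
    return C


def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def evaluate_monic(coeffs, M):
    """Evaluate M^n + a_{n-1} M^{n-1} + ... + a_0 I by Horner's rule."""
    n = len(M)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    result = identity  # the leading coefficient is 1
    for a in reversed(coeffs):
        product = mat_mul(result, M)
        result = [[product[i][j] + a * identity[i][j] for j in range(n)]
                  for i in range(n)]
    return result


# Sample polynomial X^3 - 4X^2 + 5X - 2, i.e. a_0 = -2, a_1 = 5, a_2 = -4.
C = companion([-2, 5, -4])
print(C)
print(evaluate_monic([-2, 5, -4], C))
```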
Definition 6.3.2 A matrix $A$ with entries in a field $F$ is said to be in rational canonical form if there exists an ordered set $\{f_1(X), f_2(X), \ldots, f_r(X)\}$ of polynomials such that $f_i(X)$ divides $f_{i+1}(X)$ for all $i$ and

$$A = \begin{pmatrix} A_1 & 0 & \cdots & 0\\ 0 & A_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & A_r \end{pmatrix},$$

where $A_i$ is the companion matrix of $f_i(X)$, and each 0 is the zero matrix of appropriate order.


Using Theorem 6.2.26 and the above discussion, we obtain the following theorem.


Theorem 6.3.3 Let $T$ be a linear transformation on $V$. Then there is a basis of $V$ such that the matrix of $T$ with respect to this basis is in rational canonical form. ♯


Corollary 6.3.4 Every square matrix with entries in a field $F$ is similar to a unique matrix in rational canonical form. ♯


The following example illustrates how to reduce a matrix having all its eigenvalues in the field to a matrix in rational canonical form. We choose a simple triangular matrix for convenience of computation.


Example 6.3.5 Consider the matrix $A$ given by

$$A = \begin{pmatrix} 1 & 2 & 2\\ 0 & 1 & 1\\ 0 & 0 & 2 \end{pmatrix}.$$

The eigenvalues of $A$ are 1, 1, 2, and the characteristic polynomial $\phi_A(x)$ of $A$ is given by $\phi_A(x) = (x - 1)^2(x - 2)$. The minimum polynomial $m_A(x)$ is a divisor of the characteristic polynomial having all eigenvalues as roots. Thus, the possibilities for $m_A(x)$ are $(x - 1)^2(x - 2)$ and $(x - 1)(x - 2)$. Since $(A - I)(A - 2I) \ne 0$, it follows that $m_A(x) = \phi_A(x) = (x - 1)^2(x - 2)$. Now, $\mathbb{R}^3$ is an $\mathbb{R}[x]$-module associated to the matrix $A$. The exponent of this module is the minimum polynomial $m_A(x)$. The primes dividing the exponent are $x - 1$ with multiplicity 2, and $x - 2$ with multiplicity 1. The $(x - 1)$-part $\mathbb{R}^3_{x-1}$ of the module is given by $\mathbb{R}^3_{x-1} = \{\bar{v} \in \mathbb{R}^3 \mid (x - 1)^2\cdot\bar{v} = 0\} = \{\bar{v} \mid (A - I)^2\bar{v}^t = \bar{0}^t\} = \{(v_1, v_2, v_3) \mid v_3 = 0\} = \mathbb{R}^2 \times \{0\}$. It can be easily checked that this is a cyclic submodule generated by $(1, 1, 0)$. This submodule is isomorphic to $\mathbb{R}[x]/\mathbb{R}[x](x - 1)^2$. Further, the $(x - 2)$-part $\mathbb{R}^3_{x-2}$ is the null space $\{(4a, a, a) \mid a \in \mathbb{R}\}$ of $A - 2I$, and it is a cyclic submodule of $\mathbb{R}^3$ isomorphic to $\mathbb{R}[x]/\mathbb{R}[x](x - 2)$. Indeed, since $(x - 1)^2$ and $(x - 2)$ are co-prime, $\mathbb{R}^3$ itself is a cyclic module (for example, generated by $(5, 2, 1) = (1, 1, 0) + (4, 1, 1)$), and it is isomorphic to the module $\mathbb{R}[x]/\mathbb{R}[x]m_A(x)$. Thus, the rational form of $A$ is the companion matrix of the minimum polynomial $m_A(x) = x^3 - 4x^2 + 5x - 2$ of $A$. As such, the rational form of the matrix is given by

$$\begin{pmatrix} 0 & 0 & 2\\ 1 & 0 & -5\\ 0 & 1 & 4 \end{pmatrix}.$$

To get a matrix $P$ such that $P^{-1}AP$ is the rational form of $A$, we need to find the matrix $P$ of transformation from the standard basis to the basis $\{\bar{v}^t, A\bar{v}^t, A^2\bar{v}^t\}$, where $\bar{v} = (5, 2, 1)$ is the generator of the module. Indeed, $\bar{v}^t, A\bar{v}^t, A^2\bar{v}^t$ are the columns of the matrix $P$.
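
The computation in Example 6.3.5 can be checked mechanically. The following Python sketch forms the matrix $P$ whose columns are $\bar{v}^t, A\bar{v}^t, A^2\bar{v}^t$ for $\bar{v} = (5, 2, 1)$ and verifies the relation $AP = PC$, where $C$ is the companion matrix of $x^3 - 4x^2 + 5x - 2$; together with $\det P \ne 0$ this says that $P^{-1}AP = C$. The helper names are hypothetical, and checking $AP = PC$ instead of inverting $P$ is a choice made here to stay in integer arithmetic.

```python
def mat_vec(A, v):
    """Apply the matrix A (list of rows) to the column vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def from_columns(cols):
    """Matrix whose j-th column is cols[j]."""
    return [[col[i] for col in cols] for i in range(len(cols[0]))]


def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


A = [[1, 2, 2], [0, 1, 1], [0, 0, 2]]
v = [5, 2, 1]                 # cyclic generator used in the example
Av = mat_vec(A, v)
A2v = mat_vec(A, Av)
P = from_columns([v, Av, A2v])
C = [[0, 0, 2], [1, 0, -5], [0, 1, 4]]   # companion matrix of m_A(x)

print(P, det3(P))
print(mat_mul(A, P) == mat_mul(P, C))
```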


Let $T$ be a linear transformation on a finite-dimensional vector space $V$ over a field $F$. Suppose that $m_T(X) = (X - \lambda)^n$, where $\lambda \in F$, and the $F[X]$-module $V$ associated to $T$ is a cyclic module (note that $X - \lambda$ is a prime element of $F[X]$) generated by $v$. Then $V = F[X]v = \{f(X)\cdot v \mid f(X) \in F[X]\}$. Since the period of $v$ is $(X - \lambda)^n$, it follows that $f(X)\cdot v$ is uniquely expressible as $r(X)\cdot v$, where $r(X)$ is the remainder obtained when $f(X)$ is divided by $(X - \lambda)^n$. Further, every polynomial $r(X)$ is uniquely expressible as a polynomial $s(X - \lambda)$ in $X - \lambda$ which is of the same degree as $r(X)$ (write $r(X) = r(X - \lambda + \lambda)$, and use the binomial theorem). This shows that every element of $V$ is a unique $F$-linear combination of $\{v, (X - \lambda)\cdot v, (X - \lambda)^2\cdot v, \ldots, (X - \lambda)^{n-1}\cdot v\}$. Also, by the definition of the $F[X]$-module structure, $(X - \lambda)^i\cdot v = (T - \lambda I)^i(v)$. Hence $\{v, (T - \lambda I)(v), (T - \lambda I)^2(v), \ldots, (T - \lambda I)^{n-1}(v)\}$ is a basis of $V$. Further, $T((T - \lambda I)^i(v)) = (T - \lambda I)^{i+1}(v) + \lambda(T - \lambda I)^i(v)$. This shows that the matrix of $T$ with respect to the above ordered basis is the $n \times n$ matrix $A$ given by

$$A = \begin{pmatrix} \lambda & 0 & 0 & \cdots & 0 & 0\\ 1 & \lambda & 0 & \cdots & 0 & 0\\ 0 & 1 & \lambda & \cdots & 0 & 0\\ \vdots & \vdots & \ddots & \ddots & \vdots & \vdots\\ 0 & 0 & \cdots & 1 & \lambda & 0\\ 0 & 0 & \cdots & 0 & 1 & \lambda \end{pmatrix}.$$

Such a matrix is called a Jordan block of order $n$. We have established the following proposition.


Proposition 6.3.6 Let $T$ be a linear transformation on $V$ with minimum polynomial $(X - \lambda)^n$ such that the corresponding $F[X]$-module $V$ is cyclic. Then there is a basis of $V$ with respect to which the matrix of $T$ is a Jordan block as given above. ♯


Example 6.3.7 Consider the nonidentity uni-upper triangular matrix $A$ given by

$$A = \begin{pmatrix} 1 & \alpha & \beta\\ 0 & 1 & \gamma\\ 0 & 0 & 1 \end{pmatrix}.$$

Clearly, the characteristic polynomial $\phi_A(X)$ of $A$ is $(X - 1)^3$. Further, $A - I \ne 0$, and

$$(A - I)^2 = \begin{pmatrix} 0 & 0 & \alpha\gamma\\ 0 & 0 & 0\\ 0 & 0 & 0 \end{pmatrix}.$$

Suppose that $\alpha \ne 0 \ne \gamma$. Then $(A - I)^2 \ne 0$. This means that $m_A(X) = \phi_A(X) = (X - 1)^3$. Let $\bar{v} = [v_1, v_2, v_3]$ be a nonzero vector in $\mathbb{R}^3$. Then $(A - I)\cdot\bar{v}^t = [\alpha v_2 + \beta v_3, \gamma v_3, 0]^t$, and $(A - I)^2\cdot\bar{v}^t = [\alpha\gamma v_3, 0, 0]^t$. This shows that the period $o(\bar{e}_3^{\,t})$ of the column vector $\bar{e}_3^{\,t}$ is $(X - 1)^3$. Let us consider the $\mathbb{R}[X]$-submodule of $\mathbb{R}^3$ generated by $\bar{e}_3^{\,t}$. Since the set

$$\{\bar{e}_3^{\,t},\ A\cdot\bar{e}_3^{\,t} = [\beta, \gamma, 1]^t,\ A^2\cdot\bar{e}_3^{\,t} = [2\beta + \alpha\gamma, 2\gamma, 1]^t\}$$

is a basis of $\mathbb{R}^3$, it follows that $\mathbb{R}^3$ is a cyclic $\mathbb{R}[X]$-module generated by $\bar{e}_3^{\,t}$. Hence $A$ is similar to the Jordan block of order 3 all of whose diagonal entries are 1. In particular, all such $3 \times 3$ matrices are similar.


From the structure Theorem 6.2.20 of finitely generated p-modules over a P.I.D. and Proposition 6.3.6, we have the following more general result.




Corollary 6.3.8 Let $T$ be a linear transformation on a finite-dimensional vector space $V$ of dimension $n$ over a field $F$ such that the minimum polynomial $m_T(X) = (X - \lambda)^m$. Then there exist integers $m_1 \ge m_2 \ge \cdots \ge m_r$ with $m_1 = m$, and a basis

$$\{v_1, v_2, \ldots, v_{m_1}, v_{m_1+1}, v_{m_1+2}, \ldots, v_{m_1+m_2}, v_{m_1+m_2+1}, \ldots, v_{m_1+m_2+\cdots+m_r}\}$$

of $V$ with respect to which the matrix of $T$ is

$$\begin{pmatrix} A_1 & 0 & \cdots & 0\\ 0 & A_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & A_r \end{pmatrix},$$

where $A_i$ is a Jordan block of order $m_i$ all of whose diagonal entries are $\lambda$. ♯


Example 6.3.9 Consider the nonidentity uni-upper triangular matrix $A$ given by

$$A = \begin{pmatrix} 1 & 0 & \beta\\ 0 & 1 & \gamma\\ 0 & 0 & 1 \end{pmatrix}.$$

Clearly, the characteristic polynomial $\phi_A(X)$ of $A$ is $(X - 1)^3$. Further, $A - I \ne 0$, and $(A - I)^2 = 0$. This means that $m_A(X) = (X - 1)^2$. Let $\bar{v} = [v_1, v_2, v_3]$ be a nonzero vector in $\mathbb{R}^3$. Then $(A - I)\cdot\bar{v}^t = [\beta v_3, \gamma v_3, 0]^t$. Since $(\beta, \gamma) \ne (0, 0)$, it follows that the period $o(\bar{e}_3^{\,t})$ of $\bar{e}_3^{\,t}$ in the corresponding $\mathbb{R}[X]$-module $\mathbb{R}^3$ is $(X - 1)^2$. Thus, the $\mathbb{R}[X]$-submodule of $\mathbb{R}^3$ generated by $\bar{e}_3^{\,t}$ is the subspace $W$ of $\mathbb{R}^3$ generated by the set

$$\{\bar{e}_3^{\,t},\ (A - I)\cdot\bar{e}_3^{\,t} = [\beta, \gamma, 0]^t\}.$$

Clearly, the dimension of $W$ is 2. Consider a vector $[u, v, 0]^t$ such that $\beta v - \gamma u \ne 0$. Then $\{\bar{e}_3^{\,t}, [\beta, \gamma, 0]^t, [u, v, 0]^t\}$ is a basis of $\mathbb{R}^3$. Also, the period of $[u, v, 0]^t$ is $(X - 1)$. The subspace $U$ generated by $[u, v, 0]^t$ is an $\mathbb{R}[X]$-submodule such that the module $\mathbb{R}^3$ is the direct sum $W \oplus U$. Evidently, the matrix $A$ is similar to the matrix

$$\begin{pmatrix} A_1 & 0\\ 0 & A_2 \end{pmatrix},$$

where $A_1$ is a Jordan block of order 2 with diagonal entries 1, $A_2$ is the Jordan block of order 1 with diagonal entry 1, and the zeros are zero matrices of appropriate orders. If $P$ is the matrix with first, second, and third columns $\bar{e}_3^{\,t}$, $[\beta, \gamma, 0]^t$, and $[u, v, 0]^t$, respectively, then $P^{-1}AP$ is this block matrix. Consequently, all such matrices are similar.


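Definition 6.3.10 A matrix $A$ is said to be in Jordan canonical form if there exist distinct elements $\lambda_1, \lambda_2, \ldots, \lambda_r$ in $F$, indices $0 = t_0 < t_1 < \cdots < t_r$, and integers

$$m_1, m_2, \ldots, m_{t_1}, m_{t_1+1}, m_{t_1+2}, \ldots, m_{t_2}, \ldots, m_{t_r}$$

such that $m_k \ge m_l$ for all $t_i + 1 \le k \le l \le t_{i+1}$, for all $i$, and Jordan blocks $A_1, A_2, \ldots, A_{t_r}$, where $A_j$ is the $m_j \times m_j$ Jordan block with diagonal entries $\lambda_{i+1}$ for all $j$ with $t_i + 1 \le j \le t_{i+1}$, such that

$$A = \begin{pmatrix} A_1 & 0 & \cdots & 0\\ 0 & A_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & A_{t_r} \end{pmatrix}.$$

The following result is an immediate consequence of the structure theorems (Theorems 6.2.19 and 6.2.20) of finitely generated torsion modules over a P.I.D.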


Theorem 6.3.11 Let $T$ be a linear transformation on a vector space $V$ over a field $F$ such that all the characteristic roots of $T$ are in $F$. Then there is a basis of $V$ with respect to which the matrix of $T$ is in Jordan canonical form.


Proof Since all the characteristic roots of $T$ are in $F$, the minimum polynomial $m_T(X)$ of $T$ is given by

$$m_T(X) = (X - \lambda_1)^{n_1}(X - \lambda_2)^{n_2}\cdots(X - \lambda_r)^{n_r},$$

where $\lambda_1, \lambda_2, \ldots, \lambda_r$ are the distinct characteristic roots of $T$. Thus, the $F[X]$-module $V$ associated to $T$ is a finitely generated torsion module of exponent $m_T(X)$. Using the structure theorem of finitely generated torsion modules over a P.I.D., together with the above results, we find that there is a basis of $V$ with respect to which the matrix of $T$ is in Jordan canonical form. ♯


Corollary 6.3.12 Let $T$ be a linear transformation on a finite-dimensional vector space $V$ over an algebraically closed field $F$. Then there is a basis of $V$ with respect to which the matrix of $T$ is in Jordan canonical form. ♯


Since the matrices with respect to different bases are similar, we have the following corollary.


Corollary 6.3.13 Every square matrix with entries in an algebraically closed field is similar to a matrix in Jordan canonical form. ♯


The following corollary follows from the uniqueness theorem for the decomposition of finitely generated torsion modules over a P.I.D. as a direct sum of cyclic p-modules for different primes.


Corollary 6.3.14 Let $A$ and $B$ be two $n \times n$ matrices with entries in an algebraically closed field $F$. Then $A$ and $B$ are similar if and only if they are similar to matrices in Jordan canonical form with the same set of Jordan blocks. ♯


Corollary 6.3.15 A square matrix $A$ is similar to a diagonal matrix if and only if its minimum polynomial has all its roots distinct. ♯


We illustrate the reduction of a matrix into its Jordan canonical form by means of an example.


Example 6.3.16 Consider the matrix

$$A = \begin{pmatrix} 1 & 2 & 2\\ 0 & 1 & 1\\ 0 & 0 & 2 \end{pmatrix}$$

of Example 6.3.5. As already observed in Example 6.3.5, the $\mathbb{R}[x]$-module $\mathbb{R}^3$ associated to the matrix $A$ is the direct sum of the cyclic $(x - 1)$-submodule $\mathbb{R}^2 \times \{0\}$ with generator $(1, 1, 0)$ (isomorphic to $\mathbb{R}[x]/\mathbb{R}[x](x - 1)^2$), and the cyclic $(x - 2)$-submodule $\{(4a, a, a) \mid a \in \mathbb{R}\}$ (isomorphic to $\mathbb{R}[x]/\mathbb{R}[x](x - 2)$). As such, the representation of the matrix relative to the basis $\{(1, 1, 0)^t, (A - I)(1, 1, 0)^t, (4, 1, 1)^t\}$ is the Jordan canonical form

$$\begin{pmatrix} 1 & 0 & 0\\ 1 & 1 & 0\\ 0 & 0 & 2 \end{pmatrix}$$

of the matrix $A$. The matrix $P$ of transformation is the matrix with columns $(1, 1, 0)^t$, $(A - I)(1, 1, 0)^t$, $(4, 1, 1)^t$.
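
The Jordan basis of Example 6.3.16 can be verified in the same spirit. The following Python sketch builds the chain $b_1 = (1, 1, 0)^t$, $b_2 = (A - I)b_1$ for the eigenvalue 1, takes $b_3 = (4, 1, 1)^t$ for the eigenvalue 2, and checks $AP = PJ$ for the matrix $P$ with these columns. The variable and helper names are assumptions of this illustration only.

```python
def mat_vec(A, v):
    """Apply the matrix A (list of rows) to the column vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def shift(A, lam):
    """Return A - lam * I."""
    n = len(A)
    return [[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]


A = [[1, 2, 2], [0, 1, 1], [0, 0, 2]]
N1 = shift(A, 1)              # A - I
N2 = shift(A, 2)              # A - 2I

b1 = [1, 1, 0]                # generator of the (x - 1)-part
b2 = mat_vec(N1, b1)          # (A - I) b1, an eigenvector for the eigenvalue 1
b3 = [4, 1, 1]                # eigenvector for the eigenvalue 2

P = [[b1[i], b2[i], b3[i]] for i in range(3)]   # columns b1, b2, b3
J = [[1, 0, 0], [1, 1, 0], [0, 0, 2]]           # Jordan form from the example

print(b2, mat_vec(N1, b2), mat_vec(N2, b3))
print(mat_mul(A, P) == mat_mul(P, J))
```

Since the basis is ordered as $v, (T - \lambda I)(v), \ldots$, the ones of each Jordan block appear below the diagonal, in agreement with the convention used for Jordan blocks in this section.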


Recall that a linear transformation $T$ is said to be a semi-simple linear transformation, or a diagonalizable linear transformation, if its matrix representation with respect to a certain basis is diagonal. $T$ is said to be nilpotent if $T^n = 0$ for some $n$. A square matrix $A$ is said to be a semi-simple matrix, or a diagonalizable matrix, if it is similar to a diagonal matrix. It is said to be nilpotent if $A^n = 0$ for some $n$.


Theorem 6.3.17 (Jordan–Chevalley) Let $T$ be a linear transformation on a finite-dimensional vector space $V$ over an algebraically closed field $F$ (or at least all characteristic roots of $T$ are in $F$). Then $T$ can be expressed uniquely as $T = T_s + T_n$, where $T_s$ is semi-simple, $T_n$ is nilpotent, and $T_s$ and $T_n$ commute. Further, there are polynomials $g(X)$ and $h(X)$ without constant terms such that $g(T) = T_s$ and $h(T) = T_n$.




Proof Suppose that $m_T(X) = \prod_{i=1}^{r}(X - \lambda_i)^{m_i}$, where $\lambda_1, \lambda_2, \ldots, \lambda_r$ are the distinct eigenvalues of $T$. Since every finitely generated torsion module over a P.I.D. is a direct sum of its p-submodules for the different primes $p$ dividing the exponent of the module, we have $V = V_1 \oplus V_2 \oplus \cdots \oplus V_r$, where $V$ is the $F[X]$-module associated to the linear transformation $T$, and $V_i$ is the $(X - \lambda_i)$-submodule of $V$. Clearly, $V_i = \mathrm{Ker}(T - \lambda_iI)^{m_i}$. Let $T_s$ be the linear transformation defined on $V$ by the requirement that $T_s(x) = \lambda_ix$ for all $x \in V_i$, $1 \le i \le r$. Then $T_s$ is clearly a semi-simple linear transformation. Take $T_n = T - T_s$. Then $T_n$ is nilpotent, for the matrix of $T_n$ relative to the basis of $V$ obtained by taking the union of Jordan bases of the $V_i$ is strictly lower triangular. Thus, $T = T_s + T_n$. We show that $T_s$ and $T_n$ have the required property. Since $\lambda_1, \lambda_2, \ldots, \lambda_r$ are all distinct, the set $\{(X - \lambda_1)^{m_1}, (X - \lambda_2)^{m_2}, \ldots, (X - \lambda_r)^{m_r}\}$ is a set of pairwise co-prime elements of $F[X]$. By the Chinese remainder theorem, there exists a polynomial $g(X)$ such that $g(X) \equiv \lambda_i \pmod{(X - \lambda_i)^{m_i}}$ for all $i$, and also $g(X) \equiv 0 \pmod{X}$. Then it is clear that $T_s = g(T)$, and if we take $h(X) = X - g(X)$, then $T_n = T - T_s = h(T)$. Since any two polynomials in $T$ commute with each other, it follows that $T_s$ and $T_n$ commute with each other. Next, suppose that $T = T_s' + T_n'$ is another such decomposition. Then $T_s - T_s' = T_n' - T_n$. Since $T_s'$ and $T_n'$ commute with each other, they commute with $T$, and hence with the polynomials $T_s$ and $T_n$ in $T$. Since $T_s, T_s'$ and also $T_n, T_n'$ commute, it follows that $T_s - T_s'$ is semi-simple as well as nilpotent. But then $T_s - T_s' = 0$ (note that a diagonal matrix is nilpotent if and only if it is 0). Hence $T_s = T_s'$, and so also $T_n = T_n'$. ♯
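
For a concrete instance of the construction in the proof, take the matrix $A$ of Example 6.3.5, whose minimum polynomial is $(x - 1)^2(x - 2)$. A polynomial satisfying the three congruences $g \equiv 1 \pmod{(x - 1)^2}$, $g \equiv 2 \pmod{x - 2}$ and $g \equiv 0 \pmod{x}$ is $g(x) = (x - 1)^3 + 1 = x^3 - 3x^2 + 3x$; this particular $g$ was worked out by hand for the illustration and is not taken from the text. The Python sketch below computes $A_s = g(A)$ and $A_n = A - A_s$ and checks the defining properties; the helper names are hypothetical.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(A, B, sign=1):
    """Return A + sign * B."""
    n = len(A)
    return [[A[i][j] + sign * B[i][j] for j in range(n)] for i in range(n)]


def scalar_shift(A, lam):
    """Return A - lam * I."""
    n = len(A)
    return [[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]


ZERO = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

A = [[1, 2, 2], [0, 1, 1], [0, 0, 2]]
N = scalar_shift(A, 1)                          # A - I

# g(A) = (A - I)^3 + I, with g(x) = (x - 1)^3 + 1 chosen by hand.
A_s = mat_add(mat_mul(mat_mul(N, N), N), IDENTITY)
A_n = mat_add(A, A_s, sign=-1)

print(A_s)   # semi-simple part
print(A_n)   # nilpotent part

# A_s is annihilated by (x - 1)(x - 2), which has distinct roots, so it is diagonalizable.
semi_simple_check = mat_mul(scalar_shift(A_s, 1), scalar_shift(A_s, 2))
print(semi_simple_check == ZERO, mat_mul(A_n, A_n) == ZERO,
      mat_mul(A_s, A_n) == mat_mul(A_n, A_s))
```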


Definition 6.3.18 The linear transformation $T_s$ in the above theorem is called the semi-simple part of $T$, and $T_n$ is called the nilpotent part of $T$.


Corollary 6.3.19 (Jordan–Chevalley) Let $A$ be a square matrix with entries in an algebraically closed field $F$ (or at least all the characteristic roots of $A$ are in $F$). Then $A$ can be expressed uniquely as $A = A_s + A_n$, where $A_s$ is diagonalizable, $A_n$ is nilpotent, and $A_s$ and $A_n$ commute. Further, there exist polynomials $g(X)$ and $h(X)$ without constant terms such that $A_s = g(A)$ and $A_n = h(A)$. ♯


Corollary 6.3.20 Let $T$ be a linear transformation on a finite-dimensional vector space $V$ over an algebraically closed field $F$. Then a linear transformation $S$ on $V$ commutes with $T$ if and only if it commutes with its semi-simple and nilpotent parts.


Proof Clearly, if $S$ commutes with $T_s$ and $T_n$, then it commutes with $T = T_s + T_n$. Conversely, if $S$ commutes with $T$, then it commutes with $f(T)$ for all polynomials $f(X)$, and since $T_s$ and $T_n$ are polynomials in $T$, it commutes with $T_s$ as well as with $T_n$. ♯


Recall that a linear transformation $T$ is unipotent if all of its characteristic roots are 1.


Corollary 6.3.21 (Multiplicative Jordan–Chevalley theorem) Let $T$ be a nonsingular linear transformation on a finite-dimensional vector space $V$ over an algebraically closed field $F$ (or at least all characteristic roots of $T$ are in $F$). Then $T$ is uniquely expressible as $T = T_sT_u$, where $T_s$ is semi-simple, $T_u$ is unipotent, and $T_sT_u = T_uT_s$. Further, $T_s$ and $T_u$ are polynomials in $T$.




Proof We know that $T$ is uniquely expressible as $T = T_s + T_n$, where $T_s$ is semi-simple, $T_n$ is nilpotent, $T_sT_n = T_nT_s$, and also $T_s$, $T_n$ are polynomials in $T$. Since $T$ is nonsingular, $T_s$ is also nonsingular. Set $T_u = I + T_s^{-1}T_n$. Since $T_s$ and $T_n$ commute, and $T_n$ is nilpotent, it follows that $T_s^{-1}T_n$ is nilpotent. Hence $T_u$ is unipotent. Clearly, $T = T_sT_u$. The rest follows from the properties of $T_s$ and $T_n$. ♯


Application to Differential Equations 


Consider the first-order linear differential equation

$$\frac{dx}{dt} = ax.$$

The general solution to the above differential equation is $x(t) = ce^{at}$, where $c$ is an arbitrary constant.

In complete analogy to the above differential equation, we discuss the solution to a system of $n$ homogeneous first-order linear differential equations with constant coefficients. In matrix form, this system of equations can be expressed as

$$\frac{dX(t)}{dt} = A\cdot X(t),$$

where $X(t)$ is a smooth column vector point function (a smooth function from $\mathbb{R}$ to $\mathbb{R}^n$) and $A$ is an $n \times n$ real matrix.

First, let us introduce $e^A$. Identify $M_n(\mathbb{R})$ with the Euclidean space $\mathbb{R}^{n^2}$ with the Euclidean metric. Consider the sequence $\{T_m\}$ of functions from the metric space $M_n(\mathbb{R})$ to itself defined by

$$T_m(A) = I + A + \frac{A^2}{2!} + \cdots + \frac{A^m}{m!}.$$

It can be seen easily that the above sequence is uniformly convergent on any compact subset of $M_n(\mathbb{R})$. Let us denote

$$\lim_{m\to\infty}T_m(A)$$

by $e^A$. This defines a map exp from $M_n(\mathbb{R})$ to $M_n(\mathbb{R})$ by $\exp(A) = e^A$. Using elementary analysis, we observe that exp is continuous, in fact differentiable, and its Jacobian at 0 is the identity matrix of order $n^2$. By the inverse function theorem, it follows that exp is a local diffeomorphism at 0. Again, using Abel's result, we can show that $e^{A+B} = e^A\cdot e^B$ provided that $AB = BA$. In particular, it follows that

$$e^A\cdot e^{-A} = e^{A-A} = I = e^{-A}\cdot e^A.$$




Hence $e^A$ is always nonsingular. Thus, exp is a local diffeomorphism at 0 from $M_n(\mathbb{R})$ to $GL(n, \mathbb{R})$. The map $t \mapsto e^{tA}$ is a group homomorphism from $(\mathbb{R}, +)$ to $GL(n, \mathbb{R})$ for all $A \in M_n(\mathbb{R})$. These are called one-parameter subgroups of $GL(n, \mathbb{R})$.


Theorem 6.3.22 The columns of the matrix $e^{tA}$ form a basis of the space of solutions of the system of homogeneous linear differential equations expressed in matrix form by

$$\frac{dX(t)}{dt} = A\cdot X(t),$$

where $X(t)$ is a smooth column vector point function.


Proof Using the theorem on term-by-term differentiation of a uniformly convergent series, it follows that

$$\frac{de^{tA}}{dt} = A\cdot e^{tA} = e^{tA}\cdot A.$$

Since $\frac{d}{dt}[a_{ij}(t)] = [b_{ij}(t)]$, where $b_{ij}(t) = \frac{da_{ij}(t)}{dt}$, each column $Y_j(t)$ of $e^{tA}$ satisfies

$$\frac{dY_j(t)}{dt} = A\cdot Y_j(t).$$

Thus, each column of $e^{tA}$ is a solution of the given system of differential equations. Conversely, suppose that $X(t)$ is a solution. Then

$$\frac{d(e^{-tA}X(t))}{dt} = -Ae^{-tA}X(t) + e^{-tA}\frac{dX(t)}{dt} = -Ae^{-tA}X(t) + e^{-tA}AX(t).$$

Since $A$ and $e^{-tA}$ commute, it follows that

$$\frac{d(e^{-tA}X(t))}{dt} = 0.$$

Hence $e^{-tA}X(t) = C$, where $C$ is a constant column vector. It follows that $X(t) = e^{tA}\cdot C$, and so every solution of the system of equations is a linear combination of the columns of $e^{tA}$. Since $e^{tA}$ is nonsingular, the columns are linearly independent. ♯

Thus, the problem is to compute $e^{tA}$. Let us observe the following:
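(i) If $A = \mathrm{Diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$, then $e^A = \mathrm{Diag}(e^{\lambda_1}, e^{\lambda_2}, \ldots, e^{\lambda_n})$.

(ii) If

$$A = \begin{pmatrix} 0 & t & 0 & \cdots & 0\\ 0 & 0 & t & \cdots & 0\\ \vdots & \vdots & \ddots & \ddots & \vdots\\ 0 & 0 & 0 & \cdots & t\\ 0 & 0 & 0 & \cdots & 0 \end{pmatrix},$$

then

$$e^A = \begin{pmatrix} 1 & t & \frac{t^2}{2!} & \cdots & \frac{t^{n-1}}{(n-1)!}\\ 0 & 1 & t & \cdots & \frac{t^{n-2}}{(n-2)!}\\ \vdots & \vdots & \ddots & \ddots & \vdots\\ 0 & 0 & 0 & \cdots & t\\ 0 & 0 & 0 & \cdots & 1 \end{pmatrix}.$$

(iii) If

$$A = \begin{pmatrix} B_1 & 0\\ 0 & B_2 \end{pmatrix},$$

then

$$e^A = \begin{pmatrix} e^{B_1} & 0\\ 0 & e^{B_2} \end{pmatrix}.$$

(iv) If $A = CBC^{-1}$, then $e^A = Ce^BC^{-1}$.

(v) If $A$ and $B$ commute, then $e^{A+B} = e^A\cdot e^B$.

(vi) If $A$ is a real $n \times n$ matrix, and $X(t)$ is a complex-valued solution of the system of differential equations

$$\frac{dX(t)}{dt} = A\cdot X(t),$$

then the real and imaginary parts of $X(t)$ are also solutions of the above system of equations.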
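
Observation (ii) rests on the fact that the exponential series of a nilpotent matrix is a finite sum. The following Python sketch evaluates that finite sum exactly with rational arithmetic. The function name `exp_nilpotent` is a hypothetical name for this illustration, and the function assumes, and checks, that its argument $N$ satisfies $N^n = 0$.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def exp_nilpotent(N):
    """Exact e^N = I + N + N^2/2! + ... + N^(n-1)/(n-1)! for nilpotent N."""
    n = len(N)
    term = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for k in range(1, n):
        # term becomes N^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, N)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    # For a nilpotent n x n matrix the next term N^n / n! must vanish.
    if any(x != 0 for row in mat_mul(term, N) for x in row):
        raise ValueError("matrix is not nilpotent")
    return total


# The matrix of observation (ii) with n = 3 and t = 2.
N = [[0, 2, 0], [0, 0, 2], [0, 0, 0]]
print(exp_nilpotent(N))
```

With $t = 2$ the entries $t$ and $t^2/2!$ are both 2, which matches the first row $1, t, t^2/2!$ of the matrix in observation (ii).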

By the Jordan–Chevalley decomposition, $A = A_s + A_n$, where $A_s$ is similar to a diagonal matrix, and $A_n$ is similar to a direct sum of matrices of the form

$$\begin{pmatrix} 0 & 1 & 0 & \cdots & 0\\ 0 & 0 & 1 & \cdots & 0\\ \vdots & \vdots & \ddots & \ddots & \vdots\\ 0 & 0 & 0 & \cdots & 1\\ 0 & 0 & 0 & \cdots & 0 \end{pmatrix}.$$

Further, $A_s$ and $A_n$ commute, and so $e^A = e^{A_s}\cdot e^{A_n}$. Using the above observations, we can compute $e^{tA}$, and thereby get the general solution of the given homogeneous system of linear differential equations. We illustrate the above discussion by means of an example.


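Example 6.3.23 Consider the system of differential equations given in matrix form by

$$\frac{dX(t)}{dt} = AX(t),$$

where

$$A = \begin{pmatrix} 1 & 1 & 0\\ 0 & 1 & 1\\ 0 & 0 & 1 \end{pmatrix}.$$

Then $A_s = I$, and

$$A_n = \begin{pmatrix} 0 & 1 & 0\\ 0 & 0 & 1\\ 0 & 0 & 0 \end{pmatrix}.$$

Since $tA = tA_s + tA_n$, it follows that $(tA)_s = tA_s$ and $(tA)_n = tA_n$. Again, since $tA_s$ and $tA_n$ commute, $e^{tA} = e^{tA_s}\cdot e^{tA_n}$. Clearly, $e^{tA_s} = e^tI$, and as discussed above,

$$e^{tA_n} = \begin{pmatrix} 1 & t & \frac{t^2}{2}\\ 0 & 1 & t\\ 0 & 0 & 1 \end{pmatrix}.$$

Thus,

$$e^{tA} = \begin{pmatrix} e^t & te^t & \frac{t^2}{2}e^t\\ 0 & e^t & te^t\\ 0 & 0 & e^t \end{pmatrix}.$$

The columns of the above matrix form a basis of the space of solutions.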
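
As a numerical sanity check of Example 6.3.23, the following Python sketch compares a truncated exponential series for $tA$ with the closed form obtained above, at one sample value of $t$. The function names `expm_series` and `closed_form`, the number of terms, and the tolerance are assumptions of this sketch; it uses floating-point arithmetic, so agreement is only up to rounding error.

```python
import math


def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm_series(M, terms=40):
    """Partial sum I + M + M^2/2! + ... + M^(terms-1)/(terms-1)!."""
    n = len(M)
    term = [[float(i == j) for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, M)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total


def closed_form(t):
    """e^{tA} for A = [[1,1,0],[0,1,1],[0,0,1]], as derived in the example."""
    e = math.exp(t)
    return [[e, t * e, t * t * e / 2],
            [0.0, e, t * e],
            [0.0, 0.0, e]]


A = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]
t = 0.5
tA = [[t * x for x in row] for row in A]

series = expm_series(tA)
exact = closed_form(t)
max_error = max(abs(series[i][j] - exact[i][j]) for i in range(3) for j in range(3))
print(max_error < 1e-12)
```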




Exercises 


6.3.1 Consider the following linear transformations on $\mathbb{R}^3$ whose matrix representations with respect to the standard ordered basis are given by

(i) $\begin{pmatrix} 2 & 0 & 0\\ 0 & 2 & 0\\ 0 & 0 & 3 \end{pmatrix}$,

(ii) $\begin{pmatrix} 1 & 0 & 0\\ 1 & 1 & 0\\ 0 & 0 & 3 \end{pmatrix}$,

(iii) $\begin{pmatrix} 0 & 1 & 1\\ -1 & 0 & 0\\ 1 & 0 & 0 \end{pmatrix}$,

(iv) $\begin{pmatrix} 2 & 0 & 1\\ 0 & 2 & -1\\ 1 & -1 & 2 \end{pmatrix}$, and

(v) $\begin{pmatrix} 1 & 3 & 3\\ 3 & 1 & 3\\ -3 & -3 & -5 \end{pmatrix}$.

In each case, find the minimum polynomial, the decomposition of the corresponding $\mathbb{R}[X]$-module $\mathbb{R}^3$ as a direct sum of cyclic modules, and also a basis of $\mathbb{R}^3$ with respect to which the matrix representation is in rational canonical form.


6.3.2 Reduce the matrices in Exercise 6.3.1 to rational canonical form.

6.3.3 Determine the pairs of matrices in Exercise 6.3.1 which are similar.


6.3.4 Reduce the following two matrices over $\mathbb{Z}_5$ into rational canonical forms, and determine if they are similar to each other.

(i) [matrix entries illegible in the source]

(ii) [matrix entries illegible in the source]


6.3.5 Reduce the matrices in Exercise 6.3.1 to Jordan canonical form, considering them as matrices over the field $\mathbb{C}$ of complex numbers. Determine which pairs are similar over $\mathbb{C}$.


6.3.6 Reduce the following complex matrices into Jordan canonical form, and also, in each case, find a nonsingular matrix $P$ such that $PAP^{-1}$ is in Jordan canonical form. Determine which pairs of matrices are similar to each other.

(i) [matrix entries illegible in the source]

(ii) [matrix missing in the source]

(iii) [matrix entries illegible in the source]

6.3.7 Show that a linear transformation $S$ commutes with $T$ if and only if it commutes with $T_s$ as well as with $T_n$.


6.3.8 Let $T_1$ and $T_2$ be linear transformations on a vector space $V$ of dimension 3 over a field $F$. Show that the module $V$ over $F[X]$ associated to $T_1$ is isomorphic to the $F[X]$-module $V$ associated to $T_2$ if and only if they have the same characteristic polynomials and minimum polynomials. Deduce that any two $3 \times 3$ matrices over $F$ are similar if and only if they have the same characteristic and minimum polynomials. Is this result true for $4 \times 4$ matrices? Support.




6.3.9 Let $A$ be a complex matrix all of whose characteristic roots are real. Show that $A$ is similar to a real matrix in Jordan form.


6.3.10 Let $A$ be an $n \times n$ real matrix such that $A^2 + I = 0$. Show that $n = 2r$ is even. Show also that $A$ is similar to

$$\begin{pmatrix} 0_r & I_r\\ -I_r & 0_r \end{pmatrix}.$$

6.3.11 Let $T$ be a nilpotent transformation on a complex finite-dimensional vector space $V$. Let $f(X) = a_0 + a_1X + a_2X^2 + \cdots + a_nX^n$. Find the semi-simple part of $f(T)$.

6.3.12 Find $e^{tA}$ for

$$A = \begin{pmatrix} 1 & 0 & 2\\ 0 & 1 & 0\\ 2 & 0 & 2 \end{pmatrix},$$

and also the solution of the system of linear differential equations given in matrix form by

$$\frac{dX(t)}{dt} = AX(t).$$

6.3.13 Reducing the matrix

$$A = \begin{pmatrix} 1 & 1 & 0\\ 0 & 1 & 0\\ 0 & 1 & 1 \end{pmatrix}$$

into Jordan canonical form, find $e^{tA}$, and then solve the corresponding system of differential equations.


6.3.14 If $\lambda$ is an eigenvalue of $A$, then show that $e^{\lambda}$ is an eigenvalue of $e^A$. Deduce from this fact that $e^A$ is nonsingular.


6.3.15 Show that $\mathrm{Det}(e^A) = e^{\mathrm{Tr}(A)}$.


6.3.16 Show that the map exp induces a map from the space $sl(n, \mathbb{R})$ of $n \times n$ matrices with trace 0 to the group $SL(n, \mathbb{R})$ of matrices of determinant 1. Show further that it is a local diffeomorphism. Determine the dimension of the group $SL(n, \mathbb{R})$.


6.3.17 Show that the map exp induces a local diffeomorphism from the space $SS_n(\mathbb{R})$ of skew-symmetric matrices to the group $O(n)$ of orthogonal matrices. Determine the dimension of $O(n)$.


6.3.18 Is exp surjective from $M_n(\mathbb{R})$ to $GL(n, \mathbb{R})$? Support.


6.3.19 Give an example to show that $e^{A+B}$ need not be $e^A\cdot e^B$.


Chapter 7 
General Linear Algebra 


The present chapter is devoted to the study of Noetherian rings, Projective modules, 
Injective Modules, Tensor product of modules, Grothendieck, and Whitehead groups 
of rings. 


7.1 Noetherian Rings and Modules 


Over an arbitrary ring, we note that left and right modules are in general distinct.
Recall, further, that a subspace of a finitely generated vector space over a field $F$,
together with a given choice of basis, determines and is uniquely determined by a matrix
with entries in $F$. This is a consequence of the fact that every subspace of $F^n$ is
finitely generated. However, this fact is not true over general rings. Rings over which
all submodules of the left module $R^n$ can be described by matrices are essentially the left
noetherian rings, which we describe in this section. The theory of right noetherian
modules and right noetherian rings can be developed on exactly the same lines. A
module will always mean a left module, unless stated otherwise.

A module $M$ over $R$ is said to satisfy the ascending chain condition (A.C.C.), if
given any chain

$N_1 \subseteq N_2 \subseteq \cdots \subseteq N_n \subseteq N_{n+1} \subseteq \cdots$

of submodules of $M$, there exists $n_0 \in \mathbb{N}$ such that $N_r = N_{n_0}$ for all $r \geq n_0$.
A module $M$ is said to satisfy the maximal condition, if every nonempty family
$\{M_\alpha \mid \alpha \in A\}$ of submodules of $M$ has a maximal member.
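
As an illustration that is not part of the original text, consider the $\mathbb{Z}$-module $\mathbb{Z}$, whose submodules are the subgroups $n\mathbb{Z}$. The following sketch, with the particular numbers chosen only for the example, shows a chain that becomes stationary.

```latex
% Illustrative example (added for exposition; the numbers are our own choice):
% an ascending chain of submodules of the Z-module Z.
\[
  12\mathbb{Z} \subsetneq 6\mathbb{Z} \subsetneq 3\mathbb{Z} \subsetneq \mathbb{Z}
  = \mathbb{Z} = \cdots
\]
% Since m\mathbb{Z} \subseteq n\mathbb{Z} if and only if n divides m, every strict
% inclusion removes at least one prime factor. A chain starting at m\mathbb{Z},
% m \neq 0, therefore has at most as many strict steps as m has prime factors
% counted with multiplicity (here 12 = 2 \cdot 2 \cdot 3 allows 3 strict steps).
```

In this chain one may take $n_0 = 4$ in the definition of A.C.C.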


Theorem 7.1.1 Let M be a module over R. Then the following conditions are 
equivalent. 


© Springer Nature Singapore Pte Ltd. 2017
R. Lal, Algebra 2, Infosys Science Foundation Series in Mathematical Sciences,
DOI 10.1007/978-981-10-4256-0_7




1. M satisfies A.C.C. 
2. Every submodule of M is finitely generated. 
3. M satisfies maximal condition. 


Proof $1 \Rightarrow 3$. Let $X = \{M_\alpha \mid \alpha \in A\}$ be a nonempty family of submodules
of $M$. Suppose that $X$ has no maximal member. Let $M_{\alpha_1} \in X$. Since $M_{\alpha_1}$ is not a
maximal member of the family, there is a member $M_{\alpha_2} \in X$ such that $M_{\alpha_1} \subsetneq M_{\alpha_2}$.
Again, since $M_{\alpha_2}$ is not a maximal member, there is a member $M_{\alpha_3} \in X$ such that
$M_{\alpha_2} \subsetneq M_{\alpha_3}$. Using induction, we arrive at a properly ascending chain of submodules
of $M$. This is a contradiction to 1 (note that we have used the axiom of choice in some
form).

$3 \Rightarrow 2$. Assume 3. Let $L$ be a submodule of $M$. Let $X$ be the family of all finitely
generated submodules of $L$. Clearly, $\{0\} \in X$, and so $X$ is a nonempty family. From
3, it has a maximal member $L_0$ (say). We claim that $L_0 = L$. Suppose not. Then
there is $x \in L - L_0$. But, then $L_0 + \langle x \rangle$ is also finitely generated, and so it belongs
to $X$. This is a contradiction to the choice of $L_0$. Thus, $L$ is finitely generated.

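$2 \Rightarrow 1$. Assume 2. Let

$M_1 \subseteq M_2 \subseteq \cdots \subseteq M_n \subseteq M_{n+1} \subseteq \cdots$

be an ascending chain of submodules of $M$. Then $L = \bigcup_{n=1}^{\infty} M_n$ is a submodule. By
2, $L$ is finitely generated. Suppose that $L = \langle \{x_1, x_2, \ldots, x_k\} \rangle$. Then $x_i \in M_{r_i}$
for some $r_i$. Let $r_0$ be the maximum of all $r_i$. Then $x_i \in M_{r_0}$ for each $i$. It follows
that $L = M_{r_0}$, and so $M_r = M_{r_0}$ for all $r \geq r_0$. □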


A module M over R is said to be noetherian module if it satisfies any one, 
and hence all of the conditions in the above theorem. A ring R is said to be a 
noetherian ring if it is a noetherian module over itself. 

If we consider a ring R as a left module over itself, then submodules are precisely 
the left ideals. Thus, a ring R is a left noetherian ring if and only if all its left ideals 
are finitely generated. 

Since a submodule of a submodule is again a submodule of the module, we have


Proposition 7.1.2 Every submodule of a noetherian module is a noetherian
module. □


Proposition 7.1.3 Any homomorphic image of a noetherian module is noetherian.


Proof Let $f : M_1 \to M_2$ be a surjective homomorphism, where $M_1$ is a noetherian
module. Let $L$ be a submodule of $M_2$. Then $f^{-1}(L)$ is a submodule of $M_1$. Since
$M_1$ is noetherian, $f^{-1}(L)$ is finitely generated. Since the image of a finitely generated
module under a homomorphism is finitely generated, $L = f(f^{-1}(L))$ (the equality
holds because $f$ is surjective) is finitely generated. Thus, every submodule of $M_2$ is
finitely generated. Hence $M_2$ is a noetherian module. □


The argument used in the proof of the above proposition is valid for rings (the inverse
image of an ideal under a homomorphism of rings is an ideal), and so we have the
following proposition.




Proposition 7.1.4 Any homomorphic image of a noetherian ring is noetherian. □


Corollary 7.1.5 Quotient of a noetherian module (ring) is a noetherian module
(ring). □


Proposition 7.1.6 Let M be a module over a ring R. Let L be a submodule of M 
such that L and M/L are noetherian. Then M is noetherian. 


Proof Let $U$ be a submodule of $M$. Then $(U + L)/L$, being a submodule of $M/L$, is
finitely generated. By the second isomorphism theorem, $U/(U \cap L)$ is isomorphic to
$(U + L)/L$. Hence $U/(U \cap L)$ is finitely generated. Further, $U \cap L$, being a submodule
of the noetherian module $L$, is noetherian. Hence $U \cap L$ is finitely generated. We know
that if a submodule $N$ of a module $K$ and the quotient $K/N$ are both finitely generated,
then $K$ is also finitely generated. Applying this with $K = U$ and $N = U \cap L$, we see that $U$
is finitely generated. This shows that $M$ is noetherian. □


Proposition 7.1.7 Let $M_1, M_2, \ldots, M_r$ be modules over $R$. Then $M = M_1 \times M_2 \times
\cdots \times M_r$ is noetherian if and only if each $M_i$ is noetherian.


Proof For each $i$, the projection $p_i$ is a surjective homomorphism from $M$ to $M_i$.
Since a homomorphic image of a noetherian module is noetherian, if $M$ is noetherian,
then each $M_i$ is noetherian. Conversely, suppose that each $M_i$ is noetherian. We have
to show that $M$ is noetherian. By induction, it is sufficient to prove the result for
$r = 2$. Suppose that $M_1$ and $M_2$ are noetherian. The projection $p_2$ of $M$ onto $M_2$ is a
surjective homomorphism whose kernel is $M_1 \times \{0\}$. By the fundamental theorem of
homomorphism, $M/(M_1 \times \{0\})$ is isomorphic to $M_2$. Thus, $M/(M_1 \times \{0\})$ is noetherian.
Also $M_1 \times \{0\}$ is isomorphic to $M_1$ (the map $x \mapsto (x, 0)$ is an isomorphism), and so
it is noetherian. From the previous proposition, $M$ is noetherian. □


Theorem 7.1.8 Let R be a noetherian ring. Then every finitely generated module 
over R is a noetherian module. 


Proof Let $R$ be a noetherian ring, and $M$ a finitely generated module over $R$ which
is generated by $S = \{x_1, x_2, \ldots, x_r\}$. Define a map $\eta$ from $R^r$ to $M$ by

$\eta(\alpha_1, \alpha_2, \ldots, \alpha_r) = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_r x_r.$

Clearly, $\eta$ is a surjective homomorphism. Since $R$ is noetherian, from the previous
result, $R^r$ is also noetherian. Since a homomorphic image of a noetherian module is
noetherian, $M$ is noetherian. □


Remark 7.1.9 It follows that R is noetherian if and only if every finitely generated 
module over R is noetherian. 


Example 7.1.10 Every P.I.D. is a noetherian ring because every ideal is generated 
by a singleton. Thus, every finitely generated module over a P.I.D is noetherian. In 
particular, every submodule of a finitely generated module over a P.I.D is finitely 
generated. Note that this is not true over an arbitrary ring (give an example). In turn, 
every subgroup of a finitely generated abelian group (Z-module) is finitely generated. 




Example 7.1.11 The polynomial ring $\mathbb{Z}[X_1, X_2, \ldots, X_n, \ldots]$ over $\mathbb{Z}$ in a countably
infinite set of indeterminates $\{X_1, X_2, \ldots, X_n, \ldots\}$ is not a noetherian ring, for the
ideal generated by $\{X_1, X_2, \ldots, X_n, \ldots\}$ is not finitely generated. This also shows
that, in general, a submodule of a finitely generated module need not be finitely
generated. Check if it is a U.F.D.
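
One way to see the failure of A.C.C. in this ring directly is sketched below. The evaluation map $\epsilon_n$ is a name introduced here only for the illustration; it is not notation from the text.

```latex
% Sketch (added for exposition): a properly ascending chain of ideals in
% \mathbb{Z}[X_1, X_2, \ldots].
\[
  (X_1) \subsetneq (X_1, X_2) \subsetneq \cdots \subsetneq (X_1, \ldots, X_n)
  \subsetneq (X_1, \ldots, X_n, X_{n+1}) \subsetneq \cdots
\]
% Strictness: let \epsilon_n be the ring homomorphism to \mathbb{Z} (hypothetical
% name) which sends X_1, \ldots, X_n to 0 and every other X_k to 1.
% Then \epsilon_n kills the ideal (X_1, \ldots, X_n), whereas
\[
  \epsilon_n(X_{n+1}) = 1 \neq 0, \qquad\text{so}\qquad X_{n+1} \notin (X_1, \ldots, X_n).
\]
```

By Theorem 7.1.1, a ring with such a chain cannot have all its ideals finitely generated.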


Example 7.1.12 A subring of a noetherian ring need not be a noetherian ring: The ring
$\mathbb{Z}[X_1, X_2, \ldots, X_n, \ldots]$ is an integral domain which is not noetherian. However, its
field of fractions is noetherian.


Example 7.1.13 Let $R$ be a noetherian integral domain. Then every nonzero nonunit
element of $R$ can be written as a finite product of irreducible elements of $R$. To prove
this, it is sufficient to show that there is no infinite chain $a_1, a_2, \ldots, a_n, \ldots$ such that
$a_{n+1}$ is a proper divisor of $a_n$ for all $n$, or equivalently, there is no infinite properly
ascending chain of principal ideals. This is true, for $R$ is a noetherian ring.


Theorem 7.1.14 (Hilbert Basis Theorem). Let R be a commutative ring with iden- 
tity. Then the polynomial ring R[X] is noetherian if and only if R is noetherian. 


Proof Suppose that $R[X]$ is noetherian. Define a map $\eta$ from $R[X]$ to $R$ by
$\eta(f(X)) = f(0)$ (the constant term of $f(X)$). Then $\eta$ is a surjective homomorphism.
Since a homomorphic image of a noetherian ring is a noetherian ring, $R$ is
noetherian.

Conversely, suppose that $R$ is noetherian. Then, we have to show that $R[X]$ is
noetherian. Let $A$ be an ideal of $R[X]$. We show that $A$ is finitely generated. Let
$n \in \mathbb{N} \cup \{0\}$. Define $A_n$ by

$A_n = \{a \in R \mid \text{there exists } f(X) = a_0 + a_1X + \cdots + a_{n-1}X^{n-1} + aX^n \in A\}.$

Clearly, $0 \in A_n$, for $0 = 0 + 0X + \cdots + 0X^n \in A$. Let $a, b \in A_n$ and $\alpha \in R$. Then
there exist $f(X) = a_0 + a_1X + \cdots + a_{n-1}X^{n-1} + aX^n$ and $g(X) =
b_0 + b_1X + \cdots + b_{n-1}X^{n-1} + bX^n$ in $A$. Since $A$ is an ideal, $f(X) - g(X) \in A$, and
also $\alpha f(X) \in A$. Hence $a - b \in A_n$, and also $\alpha a \in A_n$. This shows that $A_n$ is an
ideal of $R$. Further, let $a \in A_n$, and $f(X) \in A$ be such that $aX^n$ is the leading term
of $f(X)$. Since $Xf(X) \in A$, $a \in A_{n+1}$. Thus, $A_n \subseteq A_{n+1}$ for all $n$, and we have an
ascending chain

$A_0 \subseteq A_1 \subseteq A_2 \subseteq \cdots \subseteq A_n \subseteq \cdots$

of ideals of $R$. Since $R$ is noetherian, there exists $m \in \mathbb{N}$ such that $A_n =
A_m$ for all $n \geq m$. Again, since $R$ is noetherian, $A_n$ is a finitely generated ideal
of $R$ for all $n$. Let $\{a_{i1}, a_{i2}, \ldots, a_{in_i}\}$ be a set of generators of the ideal $A_i$. Let
$f_{ij}(X)$, $0 \leq i \leq m$, $1 \leq j \leq n_i$, be a polynomial in $A$ whose leading term is $a_{ij}X^i$.
We show that $S = \{f_{ij}(X) \mid 0 \leq i \leq m,\ 1 \leq j \leq n_i\}$ is a set of generators of
the ideal $A$ of $R[X]$. Let $f(X) \in A$. We show that $f(X)$ is a linear combination of
members of $S$ with coefficients in $R[X]$. The proof is by induction on the degree of
$f(X)$ (clearly 0 is a linear combination of members of $S$). If the degree of $f(X)$ is 0, then




$f(X)$ is constant, and so it belongs to $A_0$. But, then it is a linear combination of
$\{a_{01}, a_{02}, \ldots, a_{0n_0}\}$ with coefficients in $R$. Clearly, $a_{0j} = f_{0j}$, and so in this case
$f(X)$ is a linear combination of members of $S$ with coefficients in $R \subseteq R[X]$. Thus,
the result is true if the degree of $f(X)$ is 0. Assume that the result is true for all those
polynomials in $A$ whose degree is less than $r$. Let $f(X) \in A$ be of degree $r$.
There are two cases:

(i) $r \geq m$.
(ii) $r \leq m - 1$.

Consider the case (i). Let $f(X) = a_0 + a_1X + \cdots + a_rX^r$ be a member of
$A$. Then $a_r \in A_r = A_m$. Since $A_m$ is generated by $\{a_{m1}, a_{m2}, \ldots, a_{mn_m}\}$, $a_r =
\alpha_{m1}a_{m1} + \alpha_{m2}a_{m2} + \cdots + \alpha_{mn_m}a_{mn_m}$ for some $\alpha_{m1}, \alpha_{m2}, \ldots, \alpha_{mn_m}$ in $R$. Then,

$f_1(X) = f(X) - X^{r-m}(\alpha_{m1}f_{m1}(X) + \alpha_{m2}f_{m2}(X) + \cdots + \alpha_{mn_m}f_{mn_m}(X))$

is a member of $A$, and it is 0, or it is of degree less than $r$. By the induction hypothesis,
$f_1(X)$ is a linear combination of members of $S$ with coefficients in $R[X]$. In turn,
$f(X)$ is also a linear combination of members of $S$ with coefficients in $R[X]$.

Consider the case (ii). In this case $r \leq m - 1$, and so $f_{rj}(X)$, $1 \leq j \leq n_r$, are in
$S$. Since $a_r \in A_r$, we have $a_r = \alpha_{r1}a_{r1} + \alpha_{r2}a_{r2} + \cdots + \alpha_{rn_r}a_{rn_r}$ for some
$\alpha_{r1}, \alpha_{r2}, \ldots, \alpha_{rn_r}$ in $R$. Now,

$f_1(X) = f(X) - \alpha_{r1}f_{r1}(X) - \alpha_{r2}f_{r2}(X) - \cdots - \alpha_{rn_r}f_{rn_r}(X)$

belongs to $A$, and it is 0, or it is of degree less than $r$. Again, by the induction
hypothesis, $f_1(X)$ is a linear combination of members of $S$. Hence, $f(X)$ is also a
linear combination of members of $S$ with coefficients in $R[X]$. □
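
The reduction used in the proof can be traced on a small instance. The ring $R = \mathbb{Z}$, the ideal $A = (2, X)$, and the polynomial $f(X)$ below are our own choices for illustration and do not come from the text.

```latex
% Illustrative example (our own choice of R, A and f, not from the source):
% R = \mathbb{Z}, A = (2, X) \subseteq \mathbb{Z}[X], the polynomials with even
% constant term.
\[
  A_0 = 2\mathbb{Z}, \qquad A_n = \mathbb{Z} \ (n \geq 1), \qquad m = 1, \qquad
  S = \{\, f_{01}(X) = 2,\ f_{11}(X) = X \,\}.
\]
% Reduce f(X) = 3X^2 + 4X + 6 \in A, following the cases of the proof.
\begin{align*}
  f_1(X) &= f(X) - X^{2-1}\bigl(3 \cdot f_{11}(X)\bigr) = 4X + 6
         && \text{case (i), } r = 2 \geq m, \\
  f_2(X) &= f_1(X) - X^{1-1}\bigl(4 \cdot f_{11}(X)\bigr) = 6
         && \text{case (i), } r = 1 \geq m, \\
  f_2(X) &= 3 \cdot f_{01}(X)
         && \text{degree } 0.
\end{align*}
% Collecting the steps gives f as a combination of members of S:
\[
  f(X) = (3X + 4)\, f_{11}(X) + 3\, f_{01}(X).
\]
```

Each step lowers the degree, which is why the induction on the degree terminates.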


Using the induction on n, we get the following corollary. 


Corollary 7.1.15 If $R$ is a noetherian ring, then $R[X_1, X_2, \ldots, X_n]$ is also
noetherian. □


Remark 7.1.16 Although in a noetherian domain every nonzero nonunit element
is a product of irreducible elements, it need not be a U.F.D. For example, consider
$\mathbb{Z}[\sqrt{-5}]$. This is not a U.F.D. Now, $\mathbb{Z}[X]$ is a noetherian ring (by the Hilbert
basis theorem), and the map $f(X) \mapsto f(\sqrt{-5})$ is a surjective homomorphism of
rings (verify). Since a homomorphic image of a noetherian ring is noetherian, $\mathbb{Z}[\sqrt{-5}]$
is a noetherian ring. Also observe that a U.F.D. need not be a noetherian ring. For
example, $\mathbb{Z}[X_1, X_2, \ldots, X_n, \ldots]$ is a U.F.D. but it is not noetherian.
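
The standard witness that $\mathbb{Z}[\sqrt{-5}]$ is not a U.F.D. can be written out as follows. The norm map $N$ is notation introduced here for the sketch; it is not defined in this section.

```latex
% Sketch (added for exposition): two essentially different factorizations of 6.
\[
  6 = 2 \cdot 3 = (1 + \sqrt{-5})(1 - \sqrt{-5}).
\]
% Norm (notation assumed here): N(a + b\sqrt{-5}) = a^2 + 5b^2, which is
% multiplicative, and N(u) = 1 exactly for the units u = \pm 1.
\[
  N(2) = 4, \qquad N(3) = 9, \qquad N(1 \pm \sqrt{-5}) = 6.
\]
% A proper nonunit factor of any of these four elements would have norm 2 or 3,
% but a^2 + 5b^2 = 2 and a^2 + 5b^2 = 3 have no integer solutions.
% Hence all four elements are irreducible, and 2 is not an associate of
% 1 \pm \sqrt{-5} because the norms 4 and 6 differ.
```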


Exercises 
7.1.1 Show that a noetherian domain $R$ is a U.F.D. if and only if any two elements of $R$ have a g.c.d. in $R$.


7.1.2 Show that every proper ideal of a noetherian ring can be embedded in a max- 
imal ideal. 




7.1.3 Give an example of an integral domain in which every nonzero nonunit element
is expressible as a product of irreducible elements but which is still not a noetherian ring.


7.1.4 Is a direct product of noetherian rings a noetherian ring? Support. 
7.1.5 Show that a vector space is noetherian if and only if it is finite dimensional. 
7.1.6 Show that $\mathbb{Z}[\sqrt{n}]$ is a noetherian ring for all integers $n$.


7.1.7 Show that an abelian group is a noetherian Z-module if and only if it is finitely 
generated. 


7.1.8 Give an example of a noetherian module which does not satisfy D.C.C. for 
submodules. 


7.1.9 Let G be a finite commutative semigroup with identity. Show that the semi- 
group ring R(G) is noetherian if and only if R is noetherian. 


7.1.10 Suppose that R is noetherian, and G is group which is also noetherian in 
the sense that every subgroup of G is finitely generated. Is R(G) also noetherian? 
Support. 


7.1.11 Show that if R is noetherian, then the ring R[[X]] of formal power series is 
also noetherian. 

Hint. Imitate the proof of the Hilbert basis theorem by taking order function on the 
power series instead of degree of a polynomial. 


7.2 Free, Projective, and Injective Modules 


In the last section, we described rings over which all modules possess one of the
most important and crucial properties of vector spaces (modules over fields), viz., all
submodules of finitely generated modules are finitely generated. The following are two
other important properties of vector spaces: (i) Given a vector space $W$ over a field
$F$, and a surjective homomorphism $f$ from a vector space $V$ over $F$ to $W$, there is
a vector space homomorphism $t$ from $W$ to $V$ such that $f \circ t = I_W$. (ii) Given an
injective homomorphism $i$ from $W$ to $V$, there is a homomorphism $s$ from $V$ to $W$
such that $s \circ i = I_W$. In this section, we discuss modules over arbitrary rings with
these crucial properties. Later, in Chap. 9 on representation theory of finite
groups, we shall describe rings (semi-simple rings) over which all modules have both
of these crucial properties.

Let $R$ be a ring (not necessarily commutative) with identity, and $X$ be a set. We
have the following universal problem:

“Does there exist a pair $(M, i)$, where $M$ is a left $R$-module and $i$ a map from $X$ to $M$,
with the property that given any such pair $(N, j)$, there is a unique $R$-homomorphism
$\phi$ from $M$ to $N$ such that $\phi \circ i = j$?”




As in the case of free groups, the solution to this problem is unique up to isomorphism.
More precisely, if $(M, i)$ and $(N, j)$ are solutions to the above problem, then
there exists an isomorphism $\phi$ from $M$ to $N$ such that the triangle formed by
$i : X \to M$, $j : X \to N$, and $\phi : M \to N$ is commutative, that is, $\phi \circ i = j$
(imitate the proof in the case of free groups).


We show the existence of a solution to the above problem. If $X$ is a finite set
$\{x_1, x_2, \ldots, x_n\}$ containing $n$ elements, then the pair $(R^n, i)$, where $i(x_j) = e_j$,
is a solution to the problem. We do the construction of a solution to the above problem
for an arbitrary set $X$. Let $F(X)$ denote the set of all maps from $X$ to $R$ which
vanish at all but finitely many points of $X$. More precisely, $F(X) = \{f : X \to
R \mid \text{there is a finite subset } J \text{ such that } f(x) = 0 \text{ for all } x \in X - J\}$. Let
$f, g \in F(X)$. Define a map $f + g$ from $X$ to $R$ by $(f + g)(x) = f(x) + g(x)$.
Observe that $f + g \in F(X)$. This defines a binary operation $+$ on $F(X)$ such that
$(F(X), +)$ is an abelian group. Define a multiplication $\cdot$ on $F(X)$ by elements of $R$
by $(a \cdot f)(x) = a \cdot f(x)$, where $\cdot$ in the R.H.S. is the product in the ring. It is easy to
see that $F(X)$ is a left $R$-module. Define a map $i$ from $X$ to $F(X)$ by $i(x)(y) = 1$ if
$x = y$, and 0 otherwise. It is clear that $i$ is an injective map. We show that $(F(X), i)$
is a solution to the above problem. Let $f \in F(X) - \{0\}$. Let $\{x_1, x_2, \ldots, x_n\}$ be the
finite subset of $X$ such that $f(x_i) = a_i \neq 0$ for each $i$, and $f(x) = 0$ if $x \neq x_i$ for all $i$. Then it
is clear that $f = a_1 i(x_1) + a_2 i(x_2) + \cdots + a_n i(x_n)$, and such a representation is
unique. More precisely, $i(X)$ is a basis for $F(X)$. Let $N$ be a left $R$-module, and $j$
be a map from $X$ to $N$. Define a map $\phi$ from $F(X)$ to $N$ by

$\phi(a_1 i(x_1) + a_2 i(x_2) + \cdots + a_n i(x_n)) = a_1 j(x_1) + a_2 j(x_2) + \cdots + a_n j(x_n).$

Then $\phi$ is a homomorphism such that $\phi \circ i = j$. Since $i(X)$ is a basis, such a map
is unique. This completes the proof of the existence of the solution to the above
problem.
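
To make the construction concrete, here is a small worked instance. The set $X = \{x, y\}$, the ring $R = \mathbb{Z}$, the target module $\mathbb{Z}_6$, and the map $j$ are all assumptions chosen for illustration.

```latex
% Illustrative example (all specific choices are ours, not from the source):
% X = \{x, y\}, R = \mathbb{Z}, N = \mathbb{Z}_6, and j(x) = \bar{2}, j(y) = \bar{3}.
% Every element of F(X) is uniquely of the form a\, i(x) + b\, i(y), a, b \in \mathbb{Z}.
\[
  \phi\bigl(a\, i(x) + b\, i(y)\bigr) = a \cdot \bar{2} + b \cdot \bar{3}
  = \overline{2a + 3b} \in \mathbb{Z}_6 .
\]
% Check of \phi \circ i = j:
\[
  \phi(i(x)) = \bar{2} = j(x), \qquad \phi(i(y)) = \bar{3} = j(y).
\]
% \phi is surjective here, since \phi(-i(x) + i(y)) = \overline{-2 + 3} = \bar{1}.
```

In this instance $\mathbb{Z}_6$ appears as a quotient of the free module $F(X) \cong \mathbb{Z}^2$.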


Definition 7.2.1 The solution to the above universal problem is called the free left 
R-module on X. Thus, (F(X), i) is the free left R-module on X. 


Proposition 7.2.2 Every left R-module is quotient of a free left R-module. 


Proof Let $M$ be a left $R$-module, and $(F(M), i)$ the free left $R$-module on $M$. The
identity map $I_M$ is a map from the set $M$ to the left $R$-module $M$. From the universal




property of a free left $R$-module, there is a unique homomorphism $\phi$ from $F(M)$ to
$M$ such that $\phi \circ i = I_M$. This shows that $\phi$ is a surjective homomorphism. By the
fundamental theorem of homomorphism, $M$ is isomorphic to $F(M)/\ker \phi$. □


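In Chap. 6, Sect. 6.1, we defined the direct sum of finitely many $R$-modules. Now, we
define the direct sum of an arbitrary family of modules. Let $\{M_\alpha \mid \alpha \in A\}$ be a family
of $R$-modules. Then the Cartesian product

$\prod_{\alpha \in A} M_\alpha = \{x : A \to \bigcup_{\alpha \in A} M_\alpha \mid x(\alpha) \in M_\alpha \text{ for all } \alpha\}$

is a left $R$-module with respect to the operations defined by $(x + y)(\alpha) = x(\alpha) + y(\alpha)$
and $(a \cdot x)(\alpha) = a \cdot x(\alpha)$. Define a map $i_\alpha$ from $M_\alpha$ to $\prod_{\alpha \in A} M_\alpha$ by $i_\alpha(x)(\beta) = x$
if $\beta = \alpha$, and 0 otherwise. It is clear that $i_\alpha$ is an injective homomorphism. Further,
the $\alpha$th projection $p_\alpha$ from $\prod_{\alpha \in A} M_\alpha$ to $M_\alpha$ defined by $p_\alpha(x) = x(\alpha)$ is a surjective
homomorphism such that $p_\alpha \circ i_\alpha = I_{M_\alpha}$. The submodule of $\prod_{\alpha \in A} M_\alpha$ generated by
$\bigcup_{\alpha \in A} i_\alpha(M_\alpha)$ is clearly

$\{x \in \prod_{\alpha \in A} M_\alpha \mid x(\alpha) = 0 \text{ except for finitely many } \alpha\}.$

This submodule is denoted by $\oplus\sum_{\alpha \in A} M_\alpha$, and it is called the external direct sum
of the family $\{M_\alpha \mid \alpha \in A\}$. If $M^{\widehat{\alpha}}$ denotes the submodule of $\oplus\sum_{\alpha \in A} M_\alpha$ generated
by $\bigcup_{\beta \neq \alpha} i_\beta(M_\beta)$, then $i_\alpha(M_\alpha) \cap M^{\widehat{\alpha}} = \{0\}$.
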
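Proposition 7.2.3 Let $M$ be a module over a ring $R$. Let $\{M_\alpha \mid \alpha \in A\}$ be a family
of its submodules. Then the following conditions are equivalent.
1. (i) $M$ is generated by $\bigcup_{\alpha \in A} M_\alpha$.
   (ii) $M_\alpha \cap M^{\widehat{\alpha}} = \{0\}$ for each $\alpha \in A$, where $M^{\widehat{\alpha}}$ is the submodule of $M$ generated by $\bigcup_{\beta \neq \alpha} M_\beta$.
2. For every nonzero element $x \in M$, there is a unique finite subset $\{\alpha_1, \alpha_2, \ldots, \alpha_r\}$
of distinct elements of $A$ together with unique nonzero elements $x_{\alpha_i} \in M_{\alpha_i}$ for
each $i$, $1 \leq i \leq r$, such that

$x = x_{\alpha_1} + x_{\alpha_2} + \cdots + x_{\alpha_r}.$
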
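Proof ($1 \Rightarrow 2$) Assume 1. Let $x$ be a nonzero element of $M$. From 1(i), there exist
a finite subset $\{\alpha_1, \alpha_2, \ldots, \alpha_r\}$ of $A$ together with nonzero elements $x_{\alpha_i} \in M_{\alpha_i}$ for
each $i$, $1 \leq i \leq r$, such that

$x = x_{\alpha_1} + x_{\alpha_2} + \cdots + x_{\alpha_r}.$

Next, we prove the uniqueness. Suppose that

$x = x_{\alpha_1} + x_{\alpha_2} + \cdots + x_{\alpha_r} = y_{\beta_1} + y_{\beta_2} + \cdots + y_{\beta_s},$

where $\{\alpha_1, \alpha_2, \ldots, \alpha_r\}$ and $\{\beta_1, \beta_2, \ldots, \beta_s\}$ are sets of distinct elements of $A$, $x_{\alpha_i} \in
M_{\alpha_i} - \{0\}$ for all $i$, $1 \leq i \leq r$, and $y_{\beta_j} \in M_{\beta_j} - \{0\}$ for all $j$, $1 \leq j \leq s$. We need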




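to show the following: (i) $r = s$, (ii) after some rearrangement $\alpha_i = \beta_i$ for all $i$,
and (iii) $x_{\alpha_i} = y_{\beta_i}$ for all $i$. We prove it by induction on $\max(r, s)$. Suppose that
$\max(r, s) = 1$. Clearly, $r = 1 = s$. If $\alpha_1 \neq \beta_1$, then $x \in M_{\alpha_1} \cap M^{\widehat{\alpha_1}} = \{0\}$. This
means that $x = 0$, a contradiction. Hence $\alpha_1 = \beta_1$, and then $x = x_{\alpha_1} = y_{\beta_1}$. Thus
the result is true for $\max(r, s) = 1$. Assume that the result is true for $\max(r, s) = n$.
Let $x$ be a nonzero element having representations

$x = x_{\alpha_1} + x_{\alpha_2} + \cdots + x_{\alpha_{n+1}} = y_{\beta_1} + y_{\beta_2} + \cdots + y_{\beta_m},$

where $\{\alpha_1, \alpha_2, \ldots, \alpha_{n+1}\}$ and $\{\beta_1, \beta_2, \ldots, \beta_m\}$ are sets of distinct elements of $A$,
$m \leq n + 1$, $x_{\alpha_i} \in M_{\alpha_i} - \{0\}$ for all $i$, $1 \leq i \leq n + 1$, and $y_{\beta_j} \in M_{\beta_j} - \{0\}$ for all $j$, $1 \leq
j \leq m$. We show that $\alpha_1 = \beta_j$ for some $j$. Suppose not. Then

$x_{\alpha_1} = -x_{\alpha_2} - \cdots - x_{\alpha_{n+1}} + y_{\beta_1} + y_{\beta_2} + \cdots + y_{\beta_m}$

belongs to $M_{\alpha_1} \cap M^{\widehat{\alpha_1}} = \{0\}$. Hence $x_{\alpha_1} = 0$. This is a contradiction to the
supposition that $x_{\alpha_1} \neq 0$. Thus, $\alpha_1 = \beta_j$ for some $j$. After rearranging, we may
assume that $\alpha_1 = \beta_1$. Further,

$x_{\alpha_1} - y_{\beta_1} = -x_{\alpha_2} - \cdots - x_{\alpha_{n+1}} + y_{\beta_2} + \cdots + y_{\beta_m}.$

Hence $x_{\alpha_1} - y_{\beta_1}$ belongs to $M_{\alpha_1} \cap M^{\widehat{\alpha_1}} = \{0\}$. This shows that $x_{\alpha_1} = y_{\beta_1}$. In turn,

$x_{\alpha_2} + x_{\alpha_3} + \cdots + x_{\alpha_{n+1}} = y_{\beta_2} + y_{\beta_3} + \cdots + y_{\beta_m}.$

By the induction hypothesis, $n + 1 = m$, $\alpha_i = \beta_i$, and $x_{\alpha_i} = y_{\beta_i}$ for all $i$.

($2 \Rightarrow 1$) Assume 2. Evidently, 1(i) follows. From the uniqueness of the representation
of a nonzero element in $M$, it follows that $M_\alpha \cap M^{\widehat{\alpha}}$ cannot contain any
nonzero element of $M$. Thus, $M_\alpha \cap M^{\widehat{\alpha}} = \{0\}$. □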


Definition 7.2.4 We say that a module $M$ over a ring $R$ is an internal direct sum
of the family $\{M_\alpha \mid \alpha \in A\}$ of its submodules if it satisfies any one, and hence both,
of the conditions in the above proposition.


Proposition 7.2.5 Let $M$ be an internal direct sum of the family $\{M_\alpha \mid \alpha \in A\}$ of
its submodules. Then $M$ is isomorphic to the external direct sum $\oplus\sum_{\alpha \in A} M_\alpha$.


Proof The map $\eta$ from the external direct sum $\oplus\sum_{\alpha \in A} M_\alpha$ to $M$ defined by $\eta(x) =
\sum_{\alpha \in A} x(\alpha)$ is easily seen to be an isomorphism. □


From now onward, we shall not distinguish the internal and the external direct 
sums. 

It follows from the construction of a free $R$-module $F(X)$ on a set $X$ that $F(X)$
is isomorphic to the direct sum $\oplus\sum_{\alpha \in X} M_\alpha$, where $M_\alpha = R$ for all $\alpha$. Thus, $F(X)$
is precisely the direct sum of $X$ copies of $R$.




Consider a chain

$\cdots \longrightarrow M_{n+1} \xrightarrow{\alpha_{n+1}} M_n \xrightarrow{\alpha_n} M_{n-1} \longrightarrow \cdots$

where $M_n$ is an $R$-module for all $n$, and $\alpha_n$ is a homomorphism for all $n$. This
chain is said to be exact at $M_n$ if $\ker \alpha_n = \mathrm{image}\, \alpha_{n+1}$. It is said to be an
exact sequence, if it is exact at all $M_n$.

An exact sequence

$0 \longrightarrow M_1 \xrightarrow{\alpha} M_2 \xrightarrow{\beta} M_3 \longrightarrow 0,$

where 0 is the trivial module, is called a short exact sequence. Clearly, the above
sequence is a short exact sequence if and only if (i) $\alpha$ is injective, (ii) $\beta$ is surjective,
and (iii) $\ker \beta = \mathrm{image}\, \alpha$.

If $N$ is a submodule of a module $M$, then

$0 \longrightarrow N \xrightarrow{i} M \xrightarrow{\nu} M/N \longrightarrow 0$

is a short exact sequence, where $i$ is the inclusion map, and $\nu$ is the quotient map.

The sequence $0 \to M_1 \to M_2$ is exact if and only if $M_1 \to M_2$ is injective.
The sequence $M_2 \to M_3 \to 0$ is exact if and only if $M_2 \to M_3$ is surjective,
and the sequence $0 \to M_1 \to M_2 \to 0$ is exact if and only if $M_1 \to M_2$ is an
isomorphism.
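
A familiar short exact sequence of $\mathbb{Z}$-modules is sketched below for illustration. The map name $f_m$ for multiplication by $m$ is borrowed from its later use in this section; the exactness check is ours.

```latex
% Illustrative example (added for exposition): for an integer m > 1, let f_m be
% multiplication by m on \mathbb{Z}, and \nu the quotient map onto
% \mathbb{Z}_m = \mathbb{Z}/m\mathbb{Z}.
\[
  0 \longrightarrow \mathbb{Z} \xrightarrow{\ f_m\ } \mathbb{Z}
    \xrightarrow{\ \nu\ } \mathbb{Z}_m \longrightarrow 0
\]
% (i)   f_m is injective:  ma = 0 implies a = 0 in \mathbb{Z}.
% (ii)  \nu is surjective: it is a quotient map.
% (iii) \ker \nu = m\mathbb{Z} = \mathrm{image}\, f_m.
```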


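Theorem 7.2.6 (Five Lemma) Consider the following commutative diagram where
rows are exact, and vertical maps are homomorphisms.

$$\begin{array}{ccccccccc}
M_1 & \xrightarrow{\alpha_1} & M_2 & \xrightarrow{\alpha_2} & M_3 & \xrightarrow{\alpha_3} & M_4 & \xrightarrow{\alpha_4} & M_5 \\
\downarrow f_1 & & \downarrow f_2 & & \downarrow f_3 & & \downarrow f_4 & & \downarrow f_5 \\
N_1 & \xrightarrow{\beta_1} & N_2 & \xrightarrow{\beta_2} & N_3 & \xrightarrow{\beta_3} & N_4 & \xrightarrow{\beta_4} & N_5
\end{array}$$

(i) If $f_1$ is surjective, and $f_2$ and $f_4$ are injective, then $f_3$ is injective.
(ii) If $f_5$ is injective, and $f_2$ and $f_4$ are surjective, then $f_3$ is surjective.
(iii) If $f_1, f_2, f_4, f_5$ are isomorphisms, then $f_3$ is also an isomorphism.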


Proof (i). Suppose that $f_1$ is surjective, and $f_2$ and $f_4$ are injective. We have to show
that $f_3$ is injective. Suppose that $f_3(m) = 0$. Then $f_4(\alpha_3(m)) = \beta_3(f_3(m))$
(commutativity of the diagram) $= \beta_3(0) = 0$. Since $f_4$ is injective,
$\alpha_3(m) = 0$. Thus, $m \in \ker \alpha_3 = \mathrm{image}\, \alpha_2$ (exactness), and hence




there is an element $m_2 \in M_2$ such that $\alpha_2(m_2) = m$. Further, $0 =
f_3(m) = f_3(\alpha_2(m_2)) = \beta_2(f_2(m_2))$ (commutativity of the diagram). Thus,
$f_2(m_2) \in \ker \beta_2 = \mathrm{image}\, \beta_1$ (exactness). Hence, there exists $n_1 \in N_1$ such
that $\beta_1(n_1) = f_2(m_2)$. Since $f_1$ is surjective, there is an element $m_1 \in M_1$ such
that $f_1(m_1) = n_1$. Now, $f_2(\alpha_1(m_1)) = \beta_1(f_1(m_1)) = \beta_1(n_1) = f_2(m_2)$.
Since $f_2$ is injective, $\alpha_1(m_1) = m_2$. But, already $\alpha_2(m_2) = m$. Hence
$m = \alpha_2(\alpha_1(m_1)) = 0$, for $\mathrm{image}\, \alpha_1 = \ker \alpha_2$. This shows that $f_3$ is
injective.

(ii). Suppose that $f_5$ is injective, and $f_2$ and $f_4$ are surjective. We have to show that $f_3$
is surjective. Let $n \in N_3$. We have to show the existence of an element $m \in M_3$
such that $f_3(m) = n$. Now, $\beta_3(n) \in N_4$. Since $f_4$ is surjective, there is an
element $m_4 \in M_4$ such that $f_4(m_4) = \beta_3(n)$. Now $f_5(\alpha_4(m_4)) = \beta_4(f_4(m_4))$
(commutativity of the diagram) $= \beta_4(\beta_3(n)) = 0$ (exactness). Since $f_5$ is
injective, $\alpha_4(m_4) = 0$. Thus, $m_4 \in \ker \alpha_4 = \mathrm{image}\, \alpha_3$. Hence, there
is an element $m_3 \in M_3$ such that $\alpha_3(m_3) = m_4$. Since $\beta_3(f_3(m_3)) =
f_4(\alpha_3(m_3)) = f_4(m_4) = \beta_3(n)$, $\beta_3(n - f_3(m_3)) = 0$. Thus, $n - f_3(m_3) \in
\ker \beta_3 = \mathrm{image}\, \beta_2$. Hence there exists $n_2 \in N_2$ such that $\beta_2(n_2) = n - f_3(m_3)$.
Since $f_2$ is surjective, there is an element $m_2 \in M_2$ such that $f_2(m_2) = n_2$.
Now $n - f_3(m_3) = \beta_2(n_2) = \beta_2(f_2(m_2)) = f_3(\alpha_2(m_2))$. This shows that
$f_3(m_3 + \alpha_2(m_2)) = n$, and so $f_3$ is surjective.

(iii). Follows from (i) and (ii). □


Remark 7.2.7 The technique used in the proof of the above theorem is known as 
diagram chasing. 


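Corollary 7.2.8 Consider the following commutative diagram

$$\begin{array}{ccccccccc}
0 & \longrightarrow & M_1 & \longrightarrow & M_2 & \longrightarrow & M_3 & \longrightarrow & 0 \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
0 & \longrightarrow & N_1 & \longrightarrow & N_2 & \longrightarrow & N_3 & \longrightarrow & 0
\end{array}$$

where rows are exact, vertical arrows are homomorphisms, and the extreme
vertical arrows are isomorphisms. Then the middle vertical arrow is also an
isomorphism. □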


A short exact sequence

$0 \longrightarrow M_1 \xrightarrow{\alpha} M_2 \xrightarrow{\beta} M_3 \longrightarrow 0$

is said to be a split exact sequence, if there exists a homomorphism $t$ from $M_3$
to $M_2$ such that $\beta \circ t = I_{M_3}$. The homomorphism $t$ is called a splitting of the exact
sequence.




Proposition 7.2.9 A short exact sequence

$0 \longrightarrow M_1 \xrightarrow{\alpha} M_2 \xrightarrow{\beta} M_3 \longrightarrow 0$

is split exact if and only if there exists a homomorphism $s$ from $M_2$ to $M_1$ such
that $s \circ \alpha = I_{M_1}$. Further, there is a bijective correspondence between the set of
splittings of the short exact sequence and the set of all homomorphisms $s$ from $M_2$
to $M_1$ satisfying $s \circ \alpha = I_{M_1}$.


Proof Let $t$ be a splitting. Then $\beta \circ t = I_{M_3}$. Let $x \in M_2$. Then $\beta(x - t(\beta(x))) =
\beta(x) - \beta(t(\beta(x))) = \beta(x) - \beta(x) = 0$. Hence $x - t(\beta(x)) \in \ker \beta = \mathrm{image}\, \alpha$.
Since $\alpha$ is injective, there is a unique $s(x) \in M_1$ such that $\alpha(s(x)) = x - t(\beta(x))$.
Using the defining property of $s$ and the injectivity of $\alpha$, it can be seen that $s$ is
a homomorphism from $M_2$ to $M_1$. Also $\alpha(s(\alpha(y))) = \alpha(y) - t(\beta(\alpha(y))) =
\alpha(y)$ (for $\beta \circ \alpha = 0$). Since $\alpha$ is injective, $s(\alpha(y)) = y$ for all $y \in M_1$. Hence
$s \circ \alpha = I_{M_1}$. We show that the correspondence which associates a splitting $t$ with
the map $s$ defined above is bijective. Suppose that $t_1$ and $t_2$ are splittings which are
associated to the same $s$. Then $\alpha(s(x)) = x - t_1(\beta(x)) = x - t_2(\beta(x))$ for all $x \in M_2$.
Since $\beta$ is surjective, $t_1 = t_2$. Let $s$ be a homomorphism from $M_2$ to $M_1$ such that
$s \circ \alpha = I_{M_1}$. Let $y \in M_3$. Since $\beta$ is surjective, there is an element $x \in M_2$ such that
$\beta(x) = y$. Define a binary relation $t$ from $M_3$ to $M_2$ by $t(\beta(x)) = x - \alpha(s(x))$. Suppose
that $\beta(x_1) = \beta(x_2)$. Then $x_1 - x_2 \in \ker \beta = \mathrm{image}\, \alpha$. Hence there exists
$z \in M_1$ such that $\alpha(z) = x_1 - x_2$. Now, $s(x_1 - x_2) = s(\alpha(z)) = z$. Hence
$\alpha(s(x_1 - x_2)) = \alpha(z) = x_1 - x_2$. Thus, $x_1 - \alpha(s(x_1)) = x_2 - \alpha(s(x_2))$. This
shows that $t$ is a map from $M_3$ to $M_2$. It can easily be seen that $t$ is a homomorphism.
Also $\beta(t(\beta(x))) = \beta(x - \alpha(s(x))) = \beta(x) - \beta(\alpha(s(x))) = \beta(x)$, for $\beta \circ \alpha = 0$.
Thus, $t$ is a splitting, and since $x - t(\beta(x)) = \alpha(s(x))$, the homomorphism from $M_2$
to $M_1$ associated to the splitting $t$ is $s$. □


A homomorphism $s$ such that $s \circ \alpha = I_{M_1}$ is also called a splitting.
Let $M_1$ and $M_3$ be $R$-modules. Then

$0 \longrightarrow M_1 \xrightarrow{i_1} M_1 \oplus M_3 \xrightarrow{p_2} M_3 \longrightarrow 0$

is a split exact sequence, where $i_1(x) = (x, 0)$, and $p_2(x, y) = y$. The map
$i_2$ from $M_3$ to $M_1 \oplus M_3$ given by $i_2(y) = (0, y)$ is a splitting, and the associated
splitting from $M_1 \oplus M_3$ to $M_1$ is the first projection $p_1$.
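
The correspondence between the two kinds of splittings can be checked by hand on a small sequence of $\mathbb{Z}$-modules. The sequence and the maps below are our own illustrative choices, not taken from the text.

```latex
% Illustrative example (our own choice of modules and maps):
%   \alpha : \mathbb{Z}_2 \to \mathbb{Z}_6, \ \alpha(\bar{a}) = \overline{3a}, \qquad
%   \beta  : \mathbb{Z}_6 \to \mathbb{Z}_3, \ \beta(\bar{x}) = \bar{x}.
\[
  0 \longrightarrow \mathbb{Z}_2 \xrightarrow{\ \alpha\ } \mathbb{Z}_6
    \xrightarrow{\ \beta\ } \mathbb{Z}_3 \longrightarrow 0,
  \qquad \ker \beta = \{\bar{0}, \bar{3}\} = \mathrm{image}\, \alpha .
\]
% A splitting t with \beta \circ t = I_{\mathbb{Z}_3}:
\[
  t(\bar{y}) = \overline{4y}, \qquad \beta(t(\bar{1})) = \beta(\bar{4}) = \bar{1}.
\]
% The associated s is determined by \alpha(s(x)) = x - t(\beta(x)):
\[
  \bar{x} - \overline{4x} = \overline{-3x} = \overline{3x} = \alpha(\bar{x}),
  \qquad\text{so}\qquad s(\bar{x}) = \bar{x} \in \mathbb{Z}_2 ,
\]
% and indeed s(\alpha(\bar{1})) = s(\bar{3}) = \bar{1}, i.e. s \circ \alpha = I_{\mathbb{Z}_2}.
```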


Proposition 7.2.10 Let

$0 \longrightarrow M_1 \xrightarrow{\alpha} M_2 \xrightarrow{\beta} M_3 \longrightarrow 0$




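be a split exact sequence. Then there exists an isomorphism $f$ from $M_2$ to $M_1 \oplus M_3$
such that the diagram

$$\begin{array}{ccccccccc}
0 & \longrightarrow & M_1 & \xrightarrow{\alpha} & M_2 & \xrightarrow{\beta} & M_3 & \longrightarrow & 0 \\
 & & \downarrow I_{M_1} & & \downarrow f & & \downarrow I_{M_3} & & \\
0 & \longrightarrow & M_1 & \xrightarrow{i_1} & M_1 \oplus M_3 & \xrightarrow{p_2} & M_3 & \longrightarrow & 0
\end{array}$$

is commutative. Further, if $t$ and $s$ are associated splittings, then

$0 \longrightarrow M_3 \xrightarrow{t} M_2 \xrightarrow{s} M_1 \longrightarrow 0$

is also a split exact sequence.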


Proof Let $t$ be a homomorphism from $M_3$ to $M_2$ which is a splitting, and $s$ be the
associated splitting. Define a map $f$ from $M_2$ to $M_1 \oplus M_3$ by $f(x) = (s(x), \beta(x))$.
Then $f$ is a homomorphism which makes the diagram commutative (verify). By the
five lemma, $f$ is an isomorphism. Finally, since $f$ is an isomorphism, $t = f^{-1} \circ i_2$,
and $s = p_1 \circ f$. The result follows if we observe that

$0 \longrightarrow M_3 \xrightarrow{i_2} M_1 \oplus M_3 \xrightarrow{p_1} M_1 \longrightarrow 0$

is split exact. □


Let $M$ and $N$ be left $R$-modules. Let $Hom_R(M, N)$ denote the set of all
$R$-homomorphisms from $M$ to $N$. Let $f, g \in Hom_R(M, N)$. Define a map $f + g$ from $M$ to $N$ by
$(f + g)(x) = f(x) + g(x)$. It is easy to observe that $f + g$ is also a member of
$Hom_R(M, N)$. This defines an addition in $Hom_R(M, N)$ with respect to which it is
an abelian group. We may be tempted to define a module structure on $Hom_R(M, N)$
by defining $(a \cdot f)(x) = a \cdot f(x)$. But $a \cdot f$ need not be a member of $Hom_R(M, N)$,
and so it will not work in general. However, if $R$ is a commutative ring, then it is
indeed a member of $Hom_R(M, N)$, and then $Hom_R(M, N)$ becomes an $R$-module.
Note that every $R$-module is also a $\mathbb{Z}$-module, and $Hom_R(M, N)$ is a subgroup of
$Hom_{\mathbb{Z}}(M, N)$. Let $f \in Hom_{\mathbb{Z}}(M, N)$, and $a \in R$. Define a map $f \cdot a$ from $M$ to $N$
by $(f \cdot a)(x) = f(a \cdot x)$. Clearly, $f \cdot a \in Hom_{\mathbb{Z}}(M, N)$. It is easy to observe that
$Hom_{\mathbb{Z}}(M, N)$ is a right $R$-module with respect to the above right multiplication.
Also, if $R$ is a commutative ring, then $Hom_R(M, N)$ is a right $R$-submodule of
$Hom_{\mathbb{Z}}(M, N)$.
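
The two claims about scalar multiplication can be verified by a short computation, sketched here for $f \in Hom_R(M, N)$ (respectively $f \in Hom_{\mathbb{Z}}(M, N)$), $a, b \in R$, and $x \in M$.

```latex
% Sketch (added for exposition): why a \cdot f may fail to be R-linear.
% For f \in Hom_R(M, N):
\begin{align*}
  (a \cdot f)(b \cdot x) &= a \cdot f(b \cdot x) = (ab) \cdot f(x), \\
  b \cdot \bigl((a \cdot f)(x)\bigr) &= (ba) \cdot f(x).
\end{align*}
% These agree for all x when ab = ba, which is guaranteed if R is commutative.

% The right multiplication on Hom_{\mathbb{Z}}(M, N) is associative in the
% required sense, with no commutativity assumption:
\[
  \bigl((f \cdot a) \cdot b\bigr)(x) = (f \cdot a)(b \cdot x) = f\bigl(a \cdot (b \cdot x)\bigr)
  = f\bigl((ab) \cdot x\bigr) = \bigl(f \cdot (ab)\bigr)(x).
\]
```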

Let $M_1$ and $M_2$ be left $R$-modules, and $\alpha$ an $R$-homomorphism from $M_1$ to $M_2$. Let
$N$ be a left $R$-module. Then we have a map $\alpha^*$ from $Hom_R(M_2, N)$ to $Hom_R(M_1, N)$




defined by $\alpha^*(f) = f \circ \alpha$. It can be easily seen that $\alpha^*$ is a group homomorphism.
Similarly, we have a group homomorphism $\alpha_*$ from $Hom_R(N, M_1)$ to $Hom_R(N, M_2)$
given by $\alpha_*(f) = \alpha \circ f$. Let $\beta$ be a homomorphism from $M_2$ to $M_3$. We leave it
to the reader to verify that (i) $(\beta \circ \alpha)^* = \alpha^* \circ \beta^*$, and (ii) $(\beta \circ \alpha)_* = \beta_* \circ \alpha_*$. It is
also clear that $0^*$ and $0_*$ are the corresponding zero homomorphisms. Further, it is
straightforward to see that $(I_M)^*$ and $(I_M)_*$ are the corresponding identity maps.


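Theorem 7.2.11 Hom is a left exact functor in the following sense.

(i) If

$M_1 \xrightarrow{\alpha} M_2 \xrightarrow{\beta} M_3 \longrightarrow 0$

is an exact sequence of left $R$-modules, and $N$ a left $R$-module, then the sequence

$0 \longrightarrow Hom_R(M_3, N) \xrightarrow{\beta^*} Hom_R(M_2, N) \xrightarrow{\alpha^*} Hom_R(M_1, N)$

is exact.
(ii) If

$0 \longrightarrow M_1 \xrightarrow{\alpha} M_2 \xrightarrow{\beta} M_3$

is an exact sequence, then

$0 \longrightarrow Hom_R(N, M_1) \xrightarrow{\alpha_*} Hom_R(N, M_2) \xrightarrow{\beta_*} Hom_R(N, M_3)$

is also exact.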


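Proof (i). Let $f \in Hom_R(M_3, N)$ be such that $\beta^*(f) = 0$. Then $f \circ \beta = 0$. Since $\beta$
is surjective (exactness), it follows that $f = 0$. This shows that $\beta^*$ is injective. Next,
since $\beta \circ \alpha = 0$ (exactness of the given sequence), $\alpha^* \circ \beta^* = (\beta \circ \alpha)^* = 0^* = 0$.
Hence $\mathrm{image}\, \beta^* \subseteq \ker \alpha^*$. Let $f \in \ker \alpha^*$. Then $\alpha^*(f) = f \circ \alpha = 0$, and so
$\ker f \supseteq \mathrm{image}\, \alpha = \ker \beta$. By the fundamental theorem of homomorphism, there is
a unique homomorphism $\bar{f}$ from $M_2/\ker \beta$ to $N$ such that $\bar{f} \circ \nu = f$, where $\nu$ is the
quotient map from $M_2$ to $M_2/\ker \beta$. Also, since $\beta$ is surjective, we have an isomorphism
$\bar{\beta}$ from $M_2/\ker \beta$ to $M_3$ such that $\bar{\beta} \circ \nu = \beta$.
Then $\beta^*(\bar{f} \circ \bar{\beta}^{-1}) = \bar{f} \circ \bar{\beta}^{-1} \circ \beta = \bar{f} \circ \nu = f$. Thus $f \in \mathrm{image}\, \beta^*$. This shows that
$\ker \alpha^* = \mathrm{image}\, \beta^*$.

The proof of (ii) is similar, and it is left as an exercise. □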


Remark 7.2.12 Hom is not a right exact functor, for even if $\alpha$ is injective, $\alpha^*$ need
not be surjective, and even if $\beta$ is surjective, $\beta_*$ need not be surjective. Consider,
for example, the multiplication $f_m$ from $\mathbb{Z}$ to $\mathbb{Z}$ by $m$, where $m > 1$. Then $f_m^*$
from $Hom_{\mathbb{Z}}(\mathbb{Z}, \mathbb{Z})$ to itself is not surjective (verify). The quotient map $\nu$ from $\mathbb{Z}$ to




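$\mathbb{Z}_m$ is a surjective homomorphism, but $\nu_*$ from $Hom_{\mathbb{Z}}(\mathbb{Z}_m, \mathbb{Z})$ to $Hom_{\mathbb{Z}}(\mathbb{Z}_m, \mathbb{Z}_m)$
is not surjective for the simple reason that $Hom_{\mathbb{Z}}(\mathbb{Z}_m, \mathbb{Z}) = \{0\}$, whereas
$Hom_{\mathbb{Z}}(\mathbb{Z}_m, \mathbb{Z}_m) \approx \mathbb{Z}_m \neq \{0\}$.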
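
The first claim of the remark can be checked as follows. The identification of $Hom_{\mathbb{Z}}(\mathbb{Z}, \mathbb{Z})$ with $\mathbb{Z}$ through $g \mapsto g(1)$ is a standard fact that we assume for this sketch.

```latex
% Sketch (added for exposition): the effect of f_m^* on Hom_{\mathbb{Z}}(\mathbb{Z}, \mathbb{Z}).
% Each g \in Hom_{\mathbb{Z}}(\mathbb{Z}, \mathbb{Z}) is determined by g(1), since g(a) = a\, g(1).
\[
  f_m^*(g)(1) = (g \circ f_m)(1) = g(m) = m\, g(1).
\]
% Under g \mapsto g(1), the map f_m^* therefore becomes multiplication by m on
% \mathbb{Z}, whose image is m\mathbb{Z}. The identity map has g(1) = 1 \notin m\mathbb{Z}
% for m > 1, so it is not in the image of f_m^*.
```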


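Proposition 7.2.13 If

$0 \longrightarrow M_1 \xrightarrow{\alpha} M_2 \xrightarrow{\beta} M_3 \longrightarrow 0$

is a split exact sequence, and $N$ a module, then

$0 \longrightarrow Hom_R(M_3, N) \xrightarrow{\beta^*} Hom_R(M_2, N) \xrightarrow{\alpha^*} Hom_R(M_1, N) \longrightarrow 0$

and

$0 \longrightarrow Hom_R(N, M_1) \xrightarrow{\alpha_*} Hom_R(N, M_2) \xrightarrow{\beta_*} Hom_R(N, M_3) \longrightarrow 0$

are also split exact sequences.


Proof Let $t$ and $s$ be associated splittings. Then $\alpha^* \circ s^* = (s \circ \alpha)^* = (I_{M_1})^* = I_{Hom_R(M_1, N)}$,
and similarly, $\beta_* \circ t_* = (\beta \circ t)_* = I_{Hom_R(N, M_3)}$. This shows that $\alpha^*$ and $\beta_*$ are surjective
maps, and the above sequences split. □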


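Definition 7.2.14 A left $R$-module $P$ is called a projective left $R$-module if given
an exact sequence

$M \xrightarrow{\beta} N \longrightarrow 0$

(or equivalently, a surjective homomorphism $\beta$ from $M$ to $N$), and a homomorphism
$f$ from $P$ to $N$, there is a homomorphism $g$ from $P$ to $M$ such that the
triangle formed by $g : P \to M$, $f : P \to N$, and $\beta : M \to N$
is commutative, that is, $\beta \circ g = f$.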


Dually, we have 




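Definition 7.2.15 A left $R$-module $I$ is called an injective left $R$-module if given
any exact sequence

$0 \longrightarrow N \xrightarrow{\alpha} M$

(or equivalently, an injective homomorphism $\alpha$ from $N$ to $M$), and a homomorphism $f$ from
$N$ to $I$, there is a homomorphism $g$ from $M$ to $I$ such that the
triangle formed by $\alpha : N \to M$, $f : N \to I$, and $g : M \to I$
is commutative, that is, $g \circ \alpha = f$.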


Proposition 7.2.16 If P is a projective R-module, then every short exact sequence

$$0 \longrightarrow M \xrightarrow{\ \alpha\ } N \xrightarrow{\ \beta\ } P \longrightarrow 0$$

splits. Dually, if M is injective, then also it splits.


Proof Suppose that P is projective. Since

$$N \xrightarrow{\ \beta\ } P \longrightarrow 0$$

is exact, and $I_P$ is a homomorphism from P to P, there is a homomorphism t from
P to N such that $\beta\circ t = I_P$. Thus, t is a splitting. The rest also follows similarly. ♯


The following result follows from Propositions 7.2.10 and 7.2.13. 


Corollary 7.2.17 If the last but one term in a short exact sequence is a projective
module, or the second term in a short exact sequence is an injective module, then Hom
takes the short exact sequence to a split exact sequence. ♯


Proposition 7.2.18 Every free R-module is projective. 


Proof Let (F(X), i) be a free R-module on X, $\beta$ a surjective homomorphism
from M to N, and f a homomorphism from F(X) to N. Then $\beta^{-1}\{(f\circ i)(x)\} \neq \emptyset$ for all $x \in X$. From the axiom of choice,
there is a map c from X to M such that $c(x) \in \beta^{-1}\{(f\circ i)(x)\}$ for all $x \in X$. This
means $\beta\circ c = f\circ i$. Since (F(X), i) is a free R-module on X, there is a unique
homomorphism $\phi$ from F(X) to M such that $\phi\circ i = c$. Hence $\beta\circ\phi$ and f both
make the triangle

[Diagram: triangle with i from X to F(X), $f\circ i$ from X to N, and the map from F(X) to N]

commutative. Since (F(X), i) is a free R-module, $\beta\circ\phi = f$. This shows that
F(X) is a projective module. ♯


A submodule N of a module M over a ring R is called a direct summand of M
if there is a submodule L of M such that $M = N \oplus L$.


Proposition 7.2.19 A direct summand of a projective (injective) module is projective (injective).


Proof Suppose that $P \oplus Q$ is projective. We show that P is projective. Let $\beta$ be
a surjective homomorphism from M to N, and f a homomorphism from P to N.
Then $f\circ p_1$ is a homomorphism from $P \oplus Q$ to N. Since $P \oplus Q$ is projective, there
is a homomorphism $\phi$ from $P \oplus Q$ to M such that $\beta\circ\phi = f\circ p_1$. We have the
homomorphism $\phi\circ i_1$ from P to M such that $\beta\circ\phi\circ i_1 = f\circ p_1\circ i_1 = f\circ I_P = f$.
This shows that P is projective. The rest can be proved similarly. ♯


Theorem 7.2.20 A left R-module P is projective if and only if it is a direct summand
of a free R-module.


Proof Since a free R-module is projective, and a direct summand of a projective module
is projective, a direct summand of a free module is projective. Next suppose that
P is projective. From Proposition 7.2.2 we have a surjective homomorphism $\beta$
from F(P) to P. This gives us an exact sequence

$$0 \longrightarrow \ker\beta \longrightarrow F(P) \xrightarrow{\ \beta\ } P \longrightarrow 0.$$

Since P is projective, the sequence splits. Hence, from Proposition 7.2.10, P is a direct
summand of F(P). ♯


We have the following corollary. 


Corollary 7.2.21 A left R-module P is projective if and only if every short exact
sequence

$$0 \longrightarrow M \longrightarrow N \longrightarrow P \longrightarrow 0$$

splits.


Proof Suppose that every short exact sequence

$$0 \longrightarrow M \longrightarrow N \longrightarrow P \longrightarrow 0$$

splits. Then, in particular,

$$0 \longrightarrow \ker\beta \longrightarrow F(P) \xrightarrow{\ \beta\ } P \longrightarrow 0$$

splits. From Proposition 7.2.10, P is a direct summand of F(P). From Theorem
7.2.20, it follows that P is projective. The converse follows from Proposition
7.2.16. ♯


Corollary 7.2.22 Let $\{P_\alpha \mid \alpha \in \Lambda\}$ be a family of left R-modules. Then $P = \oplus\sum_{\alpha\in\Lambda} P_\alpha$
is projective if and only if each $P_\alpha$ is projective.


Proof If P is projective, then, since each $P_\alpha$ is a direct summand of P, each $P_\alpha$ is
projective. Conversely, suppose that each $P_\alpha$ is projective. Then $P_\alpha$ is a direct summand
of a free module $F_\alpha$. But then P is a direct summand of $\oplus\sum_{\alpha\in\Lambda} F_\alpha$. Since a direct sum
of free modules is free, the result follows. ♯


Example 7.2.23 Every vector space (being free) is projective. It is also injective. 


Example 7.2.24 A direct sum of infinite cyclic groups is $\mathbb{Z}$-projective, for it is
free.


Example 7.2.25 Since submodules of free modules over a P.I.D. are free, projective
modules and free modules over a P.I.D. are the same. In particular, all projective modules over F[X],
where F is a field, are free. It is a fact that every finitely generated projective module over
$F[X_1, X_2, \ldots, X_n]$ is free. This fact was conjectured by J.P. Serre, and it was proved
by D. Quillen and A. Suslin simultaneously and independently in 1976.


Example 7.2.26 $\mathbb{Z}_m$ is not $\mathbb{Z}$-projective, for it is not free. $\mathbb{Z}$ is $\mathbb{Z}$-projective but it is
not injective.


Example 7.2.27 A submodule of a free module need not be free. For example, $\mathbb{Z}_6$ is
free over $\mathbb{Z}_6$, but $\mathbb{Z}_3$, being an ideal of $\mathbb{Z}_6$, is a submodule, and it is not free. Also
$\mathbb{Z}_6 = \mathbb{Z}_3 \oplus \mathbb{Z}_2$, and so $\mathbb{Z}_3$ is projective but not free.
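The decomposition in Example 7.2.27 can be verified by direct enumeration. The sketch below assumes the usual identification of $\mathbb{Z}_3$ and $\mathbb{Z}_2$ with the ideals of $\mathbb{Z}_6$ generated by 4 and by 3; the variable names are illustrative only.

```python
n = 6
ring = range(n)

# Idempotents e (with e*e = e) split the ring: Z_6 = Z_6*e (+) Z_6*(1 - e).
idempotents = [e for e in ring if (e * e) % n == e]

def ideal(g):
    """Principal ideal of Z_6 generated by g, as a sorted list."""
    return sorted({(r * g) % n for r in ring})

P = ideal(3)   # two elements, a copy of Z_2
Q = ideal(4)   # three elements, a copy of Z_3

# Direct sum check: the ideals meet only in 0 and together give all of Z_6.
intersection = set(P) & set(Q)
total = sorted({(p + q) % n for p in P for q in Q})

# A free Z_6-module of finite rank k has 6**k elements, so a module with
# 3 elements cannot be free; it is nevertheless a direct summand of Z_6.
print(idempotents, P, Q, intersection, total)
```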


Example 7.2.28 A submodule of a projective module need not be a projective module.
$\mathbb{Z}_4$ is a free module over $\mathbb{Z}_4$, and so it is projective. However, $2\mathbb{Z}_4 \approx \mathbb{Z}_2$ is a submodule of
$\mathbb{Z}_4$ which is not a direct summand of a free module, and so it cannot be a projective
module. A quotient of a projective module need not be projective, for otherwise every
module would be projective.


Theorem 7.2.29 Let I be a left R-module. Then I is injective if and only if given
any left ideal A of R and an R-homomorphism f from A (considered as an R-module)
to I, there exists an R-homomorphism $\bar{f}$ from R to I such that $\bar{f}|_A = f$.


Proof If I is injective, then by the definition, every homomorphism from A to I can
be extended to a homomorphism from R to I.

Conversely, suppose that every R-homomorphism from every left ideal A to I can
be extended to a homomorphism from R to I. Let $\xi$ be an injective homomorphism
from a left R-module M to a left R-module N, and $\phi$ an R-homomorphism from
M to I. We have to show that $\phi$ can be extended to a homomorphism from N to
I. Let X be the set of all pairs $(L, \psi)$ such that L is a submodule of N containing
$\xi(M)$, and $\psi$ a homomorphism from L to I such that $\psi\circ\xi = \phi$. Then $X \neq \emptyset$, for
$(\xi(M), \psi)$ belongs to X, where $\psi(\xi(x)) = \phi(x)$ for all $x \in M$. Define a relation $\leq$
on X by $(L, \psi) \leq (L', \chi)$ if $L \subseteq L'$ and $\chi|_L = \psi$. Clearly, $(X, \leq)$ is a nonempty
partially ordered set. Let $\{(L_\alpha, \psi_\alpha) \mid \alpha \in \Lambda\}$ be a chain in X. Since the union of a chain
of submodules is a submodule, $L = \bigcup_{\alpha\in\Lambda} L_\alpha$ is a submodule of N containing
$\xi(M)$. We have a unique homomorphism $\psi$ from L to I defined by the property that
$\psi|_{L_\alpha} = \psi_\alpha$ for all $\alpha$. Clearly, $(L, \psi)$ is an upper bound of the chain. By Zorn's
Lemma, X has a maximal element $(L_0, \psi_0)$ (say). We show that $L_0 = N$. Suppose
that $L_0 \neq N$, and $x_0 \in N - L_0$. Let $A = \{\lambda \in R \mid \lambda x_0 \in L_0\}$. Then A, being the
inverse image of $L_0$ under the homomorphism $\lambda \mapsto \lambda x_0$ from R to N, is a left ideal of R,
and the map f from A to I defined by $f(\lambda) = \psi_0(\lambda x_0)$ is a homomorphism. From
our supposition, we have an R-homomorphism $\bar{f}$ from R to I such that $\bar{f}|_A = f$.
Let $L_1 = L_0 + \langle x_0 \rangle$ be the submodule of N generated by $L_0 \cup \{x_0\}$. Then
$L_0$ is a proper submodule of $L_1$. Any element of $L_1$ is of the form $u + \lambda x_0$, where
$u \in L_0$ and $\lambda \in R$. Suppose that $u_1 + \lambda_1 x_0 = u_2 + \lambda_2 x_0$, where $u_1, u_2 \in L_0$
and $\lambda_1, \lambda_2 \in R$. Then $(\lambda_2 - \lambda_1)x_0 = u_1 - u_2 \in L_0$. Hence $\lambda_2 - \lambda_1 \in A$, and
$\bar{f}(\lambda_2 - \lambda_1) = f(\lambda_2 - \lambda_1) = \psi_0((\lambda_2 - \lambda_1)x_0) = \psi_0(u_1 - u_2) = \psi_0(u_1) - \psi_0(u_2)$.
Hence $\bar{f}(\lambda_2) - \bar{f}(\lambda_1) = \psi_0(u_1) - \psi_0(u_2)$, and so $\psi_0(u_1) + \bar{f}(\lambda_1) = \psi_0(u_2) + \bar{f}(\lambda_2)$.
Thus, we have a map $\psi_1$ from $L_1$ to I given by $\psi_1(u + \lambda x_0) = \psi_0(u) + \bar{f}(\lambda)$.
Clearly, $\psi_1$ is a homomorphism, and $\psi_1|_{L_0} = \psi_0$. Hence $(L_1, \psi_1) \in X$, and
$(L_1, \psi_1) > (L_0, \psi_0)$. This is a contradiction to the maximality of $(L_0, \psi_0)$. Thus,
$L_0 = N$, and $\psi_0$ is a homomorphism from N to I such that $\psi_0\circ\xi = \phi$. This proves
that I is injective. ♯


Now, we describe Z-injective modules. Recall the following: 


Definition 7.2.30 An abelian group A is called divisible if for all $a \in A$ and
$n \in \mathbb{Z} - \{0\}$, there is a $b \in A$ such that $nb = a$.


Corollary 7.2.31 An abelian group A is Z-injective if and only if it is divisible. 


Proof An ideal of $\mathbb{Z}$ is of the form $m\mathbb{Z}$ for some nonnegative integer m. From the
above theorem, an abelian group A is $\mathbb{Z}$-injective if and only if for each m, every
homomorphism from $m\mathbb{Z}$ to A can be extended to a homomorphism from $\mathbb{Z}$ to A.

Suppose that A is $\mathbb{Z}$-injective. Let $a \in A$ and $n \in \mathbb{Z} - \{0\}$. The map f from $n\mathbb{Z}$
to A defined by $f(nm) = ma$ is a homomorphism. Since A is injective, there is
a homomorphism $\bar{f}$ from $\mathbb{Z}$ to A such that $\bar{f}|_{n\mathbb{Z}} = f$. Now $\bar{f}(n) = f(n) =
1\cdot a = a$. Suppose that $\bar{f}(1) = b$. Then $a = \bar{f}(n) = n\cdot\bar{f}(1) = n\cdot b$. This
shows that A is divisible.

Conversely, suppose that A is divisible. Let f from $n\mathbb{Z}$ to A be a homomorphism, where $n \neq 0$
(for $n = 0$, the zero map extends f).
Suppose that $f(n) = a$. Since A is divisible, there is $b \in A$ such that $nb = a$. We
have a homomorphism $\bar{f}$ from $\mathbb{Z}$ to A defined by $\bar{f}(m) = mb$. Also if $m = nr$,
then $\bar{f}(m) = nrb = rnb = ra = f(nr) = f(m)$. This means that $\bar{f}|_{n\mathbb{Z}} = f$.
It follows from the above theorem that A is injective. ♯




Example 7.2.32 The groups $(\mathbb{Q}, +)$, $(\mathbb{R}, +)$, $(\mathbb{C}, +)$, and $(S^1, \cdot)$ are all divisible
(verify), and so they are all $\mathbb{Z}$-injective.


Example 7.2.33 Homomorphic images, and also the quotients, of divisible groups
are divisible groups (verify). In particular, $\mathbb{Q}/\mathbb{Z} \approx P$ is $\mathbb{Z}$-injective. A submodule of
an injective module need not be injective, for $(\mathbb{Z}, +)$ is not injective whereas $(\mathbb{Q}, +)$
is injective.


Example 7.2.34 No nontrivial finite group can be divisible, for if $|A| = n$ and
$a \neq 0$, then we cannot find a $b \in A$ such that $nb = a$.
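Example 7.2.34 can be checked for the cyclic groups $\mathbb{Z}_n$ by brute force. The sketch below is an illustration under one assumption: multiplication by k on $\mathbb{Z}_n$ depends only on k modulo n, so it is enough to test $k = 1, \ldots, n$. The function name is hypothetical.

```python
def is_divisible(n):
    """Return True if Z_n is divisible in the sense of Definition 7.2.30.

    For every a in Z_n and every k in 1..n there must be some b with
    k*b = a in Z_n. Testing k = 1..n covers all nonzero integers, because
    multiplication by k depends only on k modulo n.
    """
    return all(
        any((k * b) % n == a for b in range(n))
        for a in range(n)
        for k in range(1, n + 1)
    )

# The trivial group is divisible; Z_n fails for n > 1 because n*b = 0 for all b.
results = {n: is_divisible(n) for n in range(1, 13)}
print(results)
```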


We have seen that every module is quotient of a projective module. Dually, we 
show that every module is submodule of an injective module. 


Proposition 7.2.35 A left R-module I is injective if and only if for every left ideal
A of R and every R-homomorphism f from A to I, there exists an $x \in I$ such that
$f(a) = ax$ for all $a \in A$.


Proof Suppose that I is injective, A a left ideal of R, and f an R-homomorphism from
A to I. Then there exists an R-homomorphism $\phi$ from R to I such that $\phi|_A = f$.
Suppose that $\phi(1) = x$. Then $\phi(a) = \phi(a\cdot 1) = a\cdot\phi(1) = a\cdot x$ for all $a \in R$.
In particular, $f(a) = a\cdot x$ for all $a \in A$. Conversely, suppose that such an $x \in I$
exists. Then the map $\phi$ from R to I defined by $\phi(a) = a\cdot x$ is a homomorphism
which is an extension of f. The result follows from Theorem 7.2.29. ♯


Let R be a ring with identity, and A be an abelian group. Then the set $Hom_{\mathbb{Z}}(R, A)$
of all additive group homomorphisms from (R, +) to A is an abelian group with
respect to the pointwise addition. $Hom_{\mathbb{Z}}(R, A)$ becomes a left R-module with respect
to the external multiplication $\cdot$ defined by $(a\cdot f)(b) = f(ba)$. If A is also a left
R-module, then $Hom_R(R, A)$ is a subgroup of $Hom_{\mathbb{Z}}(R, A)$. If $f \in Hom_R(R, A)$,
then $a\cdot f \in Hom_R(R, A)$, for $(a\cdot f)(bc) = f(bca) = bf(ca)$ (for f is an R-homomorphism)
$= b(a\cdot f)(c)$. This shows that $Hom_R(R, A)$ is a left submodule of
$Hom_{\mathbb{Z}}(R, A)$ for every left R-module A.


Proposition 7.2.36 Let M be a left R-module. Then the map $\phi$ from $Hom_R(R, M)$
to M defined by $\phi(f) = f(1)$ is an isomorphism of R-modules.


Proof Clearly, $\phi(f + g) = (f + g)(1) = f(1) + g(1) = \phi(f) + \phi(g)$,
and $\phi(a\cdot f) = (a\cdot f)(1) = f(1\cdot a) = f(a) = a\cdot f(1)$ (for f is an R-homomorphism)
$= a\cdot\phi(f)$. This shows that $\phi$ is a homomorphism. Next, suppose
that $\phi(f) = \phi(g)$. Then $f(1) = g(1)$. Since f and g are R-homomorphisms,
$f(a) = a\cdot f(1) = a\cdot g(1) = g(a)$ for all $a \in R$. This shows that $f = g$, and
so $\phi$ is injective. Lastly, let $x \in M$. Define a map f from R to M by $f(a) = a\cdot x$.
Then $f \in Hom_R(R, M)$ and $\phi(f) = x$. This shows that $\phi$ is also surjective. ♯


Proposition 7.2.37 Let A be a divisible group and R a ring. Then the left R-module
$Hom_{\mathbb{Z}}(R, A)$ is a left injective module over R.


Proof Let B be a left ideal of R, and f be an R-homomorphism from B to
$Hom_{\mathbb{Z}}(R, A)$. Then the map $\chi$ from B to A defined by $\chi(b) = f(b)(1)$ is clearly
a group homomorphism from (B, +) to A. Since A, being divisible, is $\mathbb{Z}$-injective,
we can extend $\chi$ to a group homomorphism $\bar{\chi}$ from the group (R, +) to A. Now, for
$b \in B$ and $\lambda \in R$, $f(b)(\lambda) = (\lambda\cdot f(b))(1) = f(\lambda b)(1) = \chi(\lambda b) = \bar{\chi}(\lambda b) = (b\cdot\bar{\chi})(\lambda)$. This shows that $f(b) = b\cdot\bar{\chi}$ for all
$b \in B$. From Proposition 7.2.35, the result follows. ♯


The proof of the following proposition is an easy verification. 


Proposition 7.2.38 A direct sum of divisible groups is divisible. ♯


Proposition 7.2.39 Every abelian group can be embedded into a divisible group.


Proof Let A be an abelian group. Then A is a quotient of the free abelian group F(A)
on A. Suppose that $A \approx F(A)/L$. Now, F(A) is a direct sum of copies of $\mathbb{Z}$ indexed by A, and
so it is a subgroup of the direct sum of as many copies of $(\mathbb{Q}, +)$. Thus, A is isomorphic to
a subgroup of a quotient group of the direct sum of copies of $\mathbb{Q}$. Since a direct sum
of divisible groups is divisible, and also a quotient group of a divisible group is
divisible, the result follows. ♯


Theorem 7.2.40 Every left R-module can be embedded in a left injective R-module.


Proof Let M be a left R-module. From the above proposition, (M, +) is a subgroup of
a divisible group D. Since Hom is a left exact functor, $Hom_{\mathbb{Z}}(R, M)$ is isomorphic
to a submodule of $Hom_{\mathbb{Z}}(R, D)$. Since D is divisible, $Hom_{\mathbb{Z}}(R, D)$ is injective over
R. Also $M \approx Hom_R(R, M)$ is a submodule of $Hom_{\mathbb{Z}}(R, M)$. The result follows. ♯


Corollary 7.2.41 A left R-module I is injective if and only if every short exact
sequence of the type

$$0 \longrightarrow I \longrightarrow M \longrightarrow N \longrightarrow 0$$

splits.


Proof If I is injective, then it is already seen that the sequence will split. Conversely,
suppose that every such exact sequence splits. Since every module can be embedded
in an injective module, there is an injective module M such that I is a submodule of
M. This gives us an exact sequence

$$0 \longrightarrow I \longrightarrow M \longrightarrow M/I \longrightarrow 0.$$

By our hypothesis, the above exact sequence splits. Hence I is a direct summand of
an injective module M. Since a direct summand of an injective module is an injective
module, I is an injective module. ♯


Exercises 


7.2.1 State and prove the Five lemma for groups. 




7.2.2 Develop the concept and the theory of projective and injective groups. Try to 
characterize them. 


7.2.3 A commutative integral domain R is said to be a Dedekind domain if given
any pair of ideals A and B of R such that $B \subseteq A$, there is an ideal C such that
$B = AC$. For example, every PID is a Dedekind domain. $\mathbb{Z}[\sqrt{-5}]$ is a Dedekind
domain (prove it). Indeed, if we have a subfield F of $\mathbb{C}$ which is a finite-dimensional
vector space over its subfield $\mathbb{Q}$ (such a field is called a number field), and R the
set of elements of F which are roots of monic polynomials with integer coefficients
(called the ring of algebraic integers of F), then R is a Dedekind domain. Let R be a
Dedekind domain. Let A be an ideal of R. Show that A considered as a module over
R is a finitely generated projective module.

Hint. Let $a \in A$, $a \neq 0$. Then $Ra \subseteq A$. Let B be an ideal such that $Ra = BA$.
Suppose that $a = b_1a_1 + b_2a_2 + \cdots + b_na_n$, where $b_i \in B$ and $a_i \in A$. Check that $(u_1, u_2, \ldots, u_n) \mapsto
u_1a_1 + u_2a_2 + \cdots + u_na_n$ is a module homomorphism from $R^n$ to A with a right inverse map given
by $x \mapsto (v_1, v_2, \ldots, v_n)$, where $v_ia = xb_i$.


7.2.4 Let R be a Dedekind domain. Using induction on n and the fact that projection 
maps from R” to R are module homomorphisms, show that every finitely generated 
projective module is direct sum of finitely many ideals of R. 


7.2.5 A ring R with identity is called a local ring if the set $M = R - R^*$ of
non-units forms a left ideal of R. Show that M is a two-sided ideal which is a maximal
ideal. Deduce that R/M is a division ring. Let $[a_{ij}]$ be an $n \times n$ matrix with entries in R such that the
matrix $[a_{ij} + M]$ is invertible over R/M. Show that $[a_{ij}]$ is invertible.

Hint. If $[a_{ij} + M]$ is the identity matrix over R/M, then using elementary operations
$[a_{ij}]$ can be reduced to the identity matrix.


7.2.6 Use the Exercise 7.2.5 to show that every finitely generated projective module 
over a local ring is free. 


7.2.7 Let R be a ring with identity, and A be an $n \times n$ idempotent matrix with entries in
R. Show that $R^nA$ is a finitely generated projective module, where elements of $R^n$
are treated as row matrices. Conversely, show that any finitely generated projective
module is isomorphic to such a module.


7.2.8 Let A and B be $m \times m$ idempotent matrices with entries in a ring R. Suppose
that there is an invertible $m \times m$ matrix P such that $PAP^{-1} = B$. Show that the
projective modules $R^mA$ and $R^mB$ are isomorphic as modules over R.


7.3 Tensor Product and Exterior Power 


Let R be a ring with identity. Let M be a right R-module, N a left R-module, and L
an abelian group. A map f from $M \times N$ to L is called a balanced map if it satisfies
the following two conditions:

(i) The map f is additive in both the coordinates in the sense that $f(x + y, u) =
f(x, u) + f(y, u)$ and $f(x, u + v) = f(x, u) + f(x, v)$ for all $x, y \in M$, and for
all $u, v \in N$.

(ii) $f(xa, u) = f(x, au)$ for all $x \in M$, $a \in R$, and $u \in N$.


If further, M, N, and L are two-sided R-modules, and in addition to (i) and (ii),
we have $f(xa, u) = af(x, u)$, then we say that f is a bilinear map.

If f is a balanced map, then it follows from the additivity that $f(0, u) = 0 =
f(x, 0)$.

Let M be a right R-module, and N be a left R-module. We have the following
universal problem:

“Does there exist a pair (L, f), where L is an abelian group, f a balanced map
from $M \times N$ to L with the property that if $(L', f')$ is another such pair, then there is
a unique homomorphism $\phi$ from L to $L'$ such that $\phi\circ f = f'$?”

As in earlier cases, a solution to the above problem, if it exists, is unique up to isomorphism.
For the existence, consider the free abelian group $(F(M \times N), i)$ on $M \times N$. Let A
be the subgroup of $F(M \times N)$ generated by the elements of the types


(i) $i(x + y, u) - i(x, u) - i(y, u)$,
(ii) $i(x, u + v) - i(x, u) - i(x, v)$,


and
(iii) $i(xa, u) - i(x, au)$.


Let $L = F(M \times N)/A$, and $f = \nu\circ i$, where $\nu$ is the quotient map. We show that (L, f) is a solution
to the above problem. Let $L'$ be an abelian group, and g a balanced map from
$M \times N$ to $L'$. From the universal property of a free abelian group, there is a unique
homomorphism $\phi$ from $F(M \times N)$ to $L'$ such that $\phi\circ i = g$. Since g is a balanced
map, $\phi(i(x + y, u) - i(x, u) - i(y, u)) = \phi(i(x + y, u)) - \phi(i(x, u)) - \phi(i(y, u)) =
g(x + y, u) - g(x, u) - g(y, u) = 0$. Thus, the elements of the type (i) are contained
in the kernel of $\phi$. Similarly, elements of the types (ii) and (iii) are also contained
in the kernel of $\phi$. This shows that A is contained in the kernel of $\phi$. From the
fundamental theorem of homomorphism, there is a unique homomorphism $\eta$ from
$L = F(M \times N)/A$ to $L'$ such that $\eta\circ\nu = \phi$. But then $\eta\circ f = \eta\circ\nu\circ i = \phi\circ i = g$.
This completes the proof of the fact that (L, f) is the solution to the above universal
problem. The abelian group L is denoted by $M \otimes_R N$, and it is called the tensor
product of M and N. The image $f(m, n) = i(m, n) + A$ is denoted by $m \otimes n$. Thus,
$(m, n) \mapsto m \otimes n$ is a balanced map, and hence


(i) $(x + y) \otimes u = x \otimes u + y \otimes u$,
(ii) $x \otimes (u + v) = x \otimes u + x \otimes v$,


and
(iii) $xa \otimes u = x \otimes au$


for all $x, y \in M$, $a \in R$, and $u, v \in N$.


Also $0 \otimes u = 0 = x \otimes 0$ for all $x \in M$ and $u \in N$.

Further, if $L'$ is an abelian group, and g a balanced map from $M \times N$ to $L'$, then
we have a unique homomorphism $\phi$ from $M \otimes_R N$ to $L'$ defined by the property
$\phi(m \otimes n) = g(m, n)$.


Definition 7.3.1 Let R and S be rings with identities. An abelian group M which
is a left R-module, and also a right S-module, is called a bi-(R, S) module if
$(a\cdot x)\cdot b = a\cdot(x\cdot b)$ for all $x \in M$, $a \in R$, and $b \in S$.


Observe that if R is a commutative ring with identity, then a left R-module M is
also a right R-module (define $x\cdot a = a\cdot x$). In fact, it is a bi-(R, R) module.


Proposition 7.3.2 Let M be a right R-module and N a bi-(R, S) module. Then
$M \otimes_R N$ has a unique right S-module structure defined by $(x \otimes u)\cdot b = x \otimes (u\cdot b)$.
If M is a bi-(S, R) module, and N a left R-module, then $M \otimes_R N$ is a left S-module.


Proof Let M be a right R-module, and N be a bi-(R, S) module. Let $b \in S$. Define
a map $f_b$ from $M \times N$ to $M \otimes_R N$ by $f_b(x, u) = x \otimes ub$. It is easy to observe (using
the fact that N is a bi-(R, S) module) that $f_b$ is a balanced map. From the universal
property of the tensor product, we have a unique homomorphism $\phi_b$ from $M \otimes_R N$ to
itself defined by the property $\phi_b(x \otimes u) = x \otimes ub$. Define an external multiplication
on $M \otimes_R N$ by elements of S from the right by $z\cdot b = \phi_b(z)$ for all $z \in M \otimes_R N$
and $b \in S$. Since $\phi_b$ is a homomorphism for all $b \in S$, and $\phi_{b_1b_2} = \phi_{b_2}\circ\phi_{b_1}$ for all
$b_1, b_2 \in S$, it follows that $M \otimes_R N$ is a right S-module with respect to the external
multiplication defined above. The rest can be proved similarly. ♯


In particular, we have the following corollary. 


Corollary 7.3.3 If R is a commutative ring, then $M \otimes_R N$ is a two-sided R-module. ♯


Proposition 7.3.4 Let M and N be bi-(R, R) modules. Then we have a unique
isomorphism f from $M \otimes_R N$ to $N \otimes_R M$ such that $f(x \otimes y) = y \otimes x$.


Proof The map $\phi$ from $M \times N$ to $N \otimes_R M$ defined by $\phi(x, y) = y \otimes x$ is a
balanced (in fact bilinear) map. From the universal property of the tensor product,
we have a unique homomorphism f subject to the condition $f(x \otimes y) = y \otimes x$.
Similarly, we have a unique homomorphism g from $N \otimes_R M$ to $M \otimes_R N$ subject to
the condition $g(y \otimes x) = x \otimes y$. Clearly, $(g\circ f)(x \otimes y) = x \otimes y$ for all $x \in M$ and
$y \in N$. Since $\{x \otimes y \mid x \in M, y \in N\}$ is a set of generators of $M \otimes_R N$, it follows
that $g\circ f = I_{M \otimes_R N}$. Similarly, $f\circ g$ is also the identity map. This shows that f is an
isomorphism. ♯


In particular, we have the following corollary. 


Corollary 7.3.5 Let R be a commutative ring, and M and N be R-modules.
Then $M \otimes_R N$ is isomorphic to $N \otimes_R M$. ♯




Proposition 7.3.6 Let M be a right R-module, N a bi-(R, S) module, and L a left S-module.
Then there is a unique isomorphism $\phi$ from $(M \otimes_R N) \otimes_S L$ to $M \otimes_R (N \otimes_S L)$
subject to the condition $\phi((x \otimes y) \otimes z) = x \otimes (y \otimes z)$ for all $x \in M$, $y \in N$, and
$z \in L$.


Proof Let $x \in M$. The map $(y, z) \mapsto (x \otimes y) \otimes z$ defines a balanced map from
$N \times L$ to $(M \otimes_R N) \otimes_S L$. Hence, there is a unique homomorphism $\phi_x$ from $N \otimes_S L$
to $(M \otimes_R N) \otimes_S L$ subject to the condition $\phi_x(y \otimes z) = (x \otimes y) \otimes z$. The map
$(x, u) \mapsto \phi_x(u)$, where $u \in N \otimes_S L$, is also a balanced map from $M \times (N \otimes_S L)$ to
$(M \otimes_R N) \otimes_S L$. Thus, there is a unique homomorphism $\psi$ from $M \otimes_R (N \otimes_S L)$ to
$(M \otimes_R N) \otimes_S L$ subject to the condition $\psi(x \otimes (y \otimes z)) = (x \otimes y) \otimes z$. Similarly, we
have a unique homomorphism $\phi$ from $(M \otimes_R N) \otimes_S L$ to $M \otimes_R (N \otimes_S L)$ subject
to the condition $\phi((x \otimes y) \otimes z) = x \otimes (y \otimes z)$. It is clear that $\phi$ and $\psi$ are inverses
of each other. ♯


Remark 7.3.7 The above result, in particular, says that if R is a commutative ring with
identity, and $M_1, M_2, \ldots, M_n$ are R-modules, then the tensor products of $M_1, M_2, \ldots, M_n$
taken in the same order with respect to any two bracket arrangements are naturally
isomorphic. Thus, we can define the tensor product $M_1 \otimes_R M_2 \otimes_R \cdots \otimes_R M_n$
unambiguously. It is universal with respect to n-linear maps in the sense that if
$\phi$ is an n-linear map from $M_1 \times M_2 \times \cdots \times M_n$ to an R-module L, then there
is a unique homomorphism $\psi$ from $M_1 \otimes_R M_2 \otimes_R \cdots \otimes_R M_n$ to L subject to
$\psi(x_1 \otimes x_2 \otimes \cdots \otimes x_n) = \phi(x_1, x_2, \ldots, x_n)$.


Proposition 7.3.8 Let R be a ring with identity, and M a left R-module. Then there is a unique
R-isomorphism f from $R \otimes_R M$ to M defined by $f(a \otimes x) = ax$.


Proof The map $(a, x) \mapsto ax$ is clearly a balanced map from $R \times M$ to M. Hence
there is a unique homomorphism $\phi$ from $R \otimes_R M$ to M such that $\phi(a \otimes x) = ax$.
Also the map $\psi$ from M to $R \otimes_R M$ defined by $\psi(x) = 1 \otimes x$ is a homomorphism.
Now, $(\psi\circ\phi)(a \otimes x) = \psi(ax) = 1 \otimes ax = 1\cdot a \otimes x = a \otimes x$. Thus,
$\psi\circ\phi = I_{R \otimes_R M}$. Similarly, $\phi\circ\psi = I_M$. This shows that $\psi$ is a group isomorphism.
Further, $\psi(ax) = 1 \otimes ax = a \otimes x = a\cdot(1 \otimes x) = a\psi(x)$. This shows that $\psi$
is an R-isomorphism, and hence so is its inverse $\phi = f$. ♯


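Proposition 7.3.9 Let $\{M_\alpha \mid \alpha \in \Lambda\}$ be a family of right R-modules and N a
left R-module. Then there is a unique isomorphism $\phi$ from $(\oplus\sum_{\alpha\in\Lambda} M_\alpha) \otimes_R N$ to
$\oplus\sum_{\alpha\in\Lambda} (M_\alpha \otimes_R N)$ such that $\phi(f \otimes n)(\alpha) = f(\alpha) \otimes n$. A similar result holds if N
is a right R-module and $M_\alpha$ is a left R-module for each $\alpha$.


Proof The map from $(\oplus\sum_{\alpha\in\Lambda} M_\alpha) \times N$ to $\oplus\sum_{\alpha\in\Lambda} (M_\alpha \otimes_R N)$ which sends
$(f, n)$ to the element $\alpha \mapsto f(\alpha) \otimes n$ is easily seen to be a balanced map. Hence there is
a unique homomorphism $\phi$ such that $\phi(f \otimes n)(\alpha) = f(\alpha) \otimes n$. The inverse map
is the obvious map. The proof of the second part is similar. ♯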


Remark A free left R-module is isomorphic to direct sum of several copies of R, 
and which are also bi-(R, R) modules. Thus, a free left (right) R-module is also a 
free bi-(R, R) module. 




Corollary 7.3.10 The tensor product of free left R-modules is a free left R-module. In
turn, the tensor product $P \otimes Q$ of a projective bi-(R, R) module P with a projective
left R-module Q is a projective left R-module.


Proof Since a free left R-module is a direct sum of copies of R, and since
$R \otimes R$ is isomorphic to R, the first part of the result follows from the above proposition.
Further, let P be a projective bi-(R, R) module, and Q a projective left R-module.
Then there exist a right R-module L and a left R-module M such that
$P \oplus L$ is a free R-module, and $Q \oplus M$ is also a free R-module. Since the tensor product of
free R-modules is a free R-module, $(P \oplus L) \otimes (Q \oplus M)$ is a free R-module. From the
previous proposition, $(P \otimes Q) \oplus U$ is free, where $U = (P \otimes M) \oplus (L \otimes Q) \oplus (L \otimes M)$.
Hence $P \otimes Q$ is a projective module. ♯


Corollary 7.3.11 Let V and W be finite-dimensional vector spaces over a field F.
Then $\dim(V \otimes_F W) = \dim(V)\cdot\dim(W)$.


Proof Suppose that $\dim(V) = n$, and $\dim(W) = m$. Then V is isomorphic to the
direct sum of n copies of F, and W is isomorphic to the direct sum of m copies of F. From the above
proposition it follows that $V \otimes W$ is isomorphic to the direct sum of nm copies of
F. The result follows. ♯


Remark 7.3.12 Let $\{e_1, e_2, \ldots, e_n\}$ be a basis of V, and $\{f_1, f_2, \ldots, f_m\}$ be a basis
of W. Then $\{e_i \otimes f_j \mid 1 \leq i \leq n, 1 \leq j \leq m\}$ is a set of nm generators of $V \otimes W$, and
so, since $\dim(V \otimes W) = nm$, it is a basis of $V \otimes W$.
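Remark 7.3.12 can be illustrated in coordinates. The sketch below assumes the standard identification of $F^n \otimes F^m$ with $F^{nm}$ under which $u \otimes v$ corresponds to the Kronecker product of the coordinate vectors. That identification, and the helper names `kron` and `unit`, are assumptions of this example rather than part of the text.

```python
def kron(u, v):
    """Coordinates of u (x) v in the basis e_i (x) f_j, ordered by (i, j)."""
    return [a * b for a in u for b in v]

def unit(k, i):
    """i-th standard basis vector of F^k (0-based index)."""
    return [1 if j == i else 0 for j in range(k)]

n, m = 3, 2

# The nm pure tensors e_i (x) f_j ...
pure_tensors = [kron(unit(n, i), unit(m, j)) for i in range(n) for j in range(m)]

# ... are exactly the standard basis vectors of F^(nm).
standard_basis = [unit(n * m, k) for k in range(n * m)]

print(len(pure_tensors), pure_tensors == standard_basis)
```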


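Example 7.3.13 We show that $\mathbb{Z}_m \otimes_{\mathbb{Z}} \mathbb{Z}_n$ is isomorphic to $\mathbb{Z}_d$, where d is the g.c.d.
of m and n: Suppose that $\bar{a} = \bar{a'}$ in $\mathbb{Z}_m$, and $\bar{b} = \bar{b'}$ in $\mathbb{Z}_n$. Then m divides $a - a'$,
and n divides $b - b'$. This means d divides $a - a'$, and it also divides $b - b'$. In turn,
d divides $ab - a'b'$. Hence $\overline{ab} = \overline{a'b'}$ in $\mathbb{Z}_d$. Thus, we have a map f from $\mathbb{Z}_m \times \mathbb{Z}_n$
to $\mathbb{Z}_d$ defined by $f(\bar{a}, \bar{b}) = \overline{ab}$. Evidently, f is a balanced map, and so it induces a
unique homomorphism $\bar{f}$ from $\mathbb{Z}_m \otimes_{\mathbb{Z}} \mathbb{Z}_n$ to $\mathbb{Z}_d$ such that $\bar{f}(\bar{a} \otimes \bar{b}) = \overline{ab}$. Clearly,
$\bar{f}$ is surjective. Since $\bar{a} \otimes \bar{b} = \overline{ab} \otimes \bar{1}$, every element of $\mathbb{Z}_m \otimes_{\mathbb{Z}} \mathbb{Z}_n$ is of the form $\bar{c} \otimes \bar{1}$,
and so it suffices to consider pure tensors. Suppose that $\overline{ab} = \bar{0}$ in $\mathbb{Z}_d$. Then d divides ab. By the Euclidean
algorithm, there are integers u and v such that $d = um + vn$. Since d divides ab,
there are integers r and s such that $rm + sn = ab$. But then

$$\bar{a} \otimes \bar{b} = \bar{a} \otimes b\bar{1} = \overline{ab} \otimes \bar{1} = \overline{rm + sn} \otimes \bar{1} = \overline{sn} \otimes \bar{1} = \bar{s} \otimes n\bar{1} = \bar{s} \otimes \bar{0} = 0.$$

This shows that the kernel of $\bar{f}$ is $\{0\}$, and so $\bar{f}$ is also injective.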
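The first half of Example 7.3.13 can be tested exhaustively for small moduli: the map $(\bar{a}, \bar{b}) \mapsto \overline{ab}$ into $\mathbb{Z}_d$ is well defined, balanced, and onto. The sketch below only checks these properties for representatives; it does not verify injectivity of the induced map, which is the algebraic part of the argument. The function name is illustrative.

```python
from math import gcd

def check_balanced_map(m, n):
    """Check that f(a, b) = a*b mod d, d = gcd(m, n), is a well-defined,
    balanced, surjective map Z_m x Z_n -> Z_d (tested on representatives)."""
    d = gcd(m, n)
    f = lambda a, b: (a * b) % d
    for a in range(m):
        for b in range(n):
            # Independence of the chosen representatives of a and b.
            if f(a + m, b) != f(a, b) or f(a, b + n) != f(a, b):
                return False
            # Additivity in each coordinate.
            for a2 in range(m):
                if f((a + a2) % m, b) != (f(a, b) + f(a2, b)) % d:
                    return False
            for b2 in range(n):
                if f(a, (b + b2) % n) != (f(a, b) + f(a, b2)) % d:
                    return False
            # Balanced condition f(a*c, b) = f(a, c*b) for integers c.
            for c in range(max(m, n)):
                if f((a * c) % m, b) != f(a, (c * b) % n):
                    return False
    # Surjectivity onto Z_d.
    image = {f(a, b) for a in range(m) for b in range(n)}
    return image == set(range(d))

all_ok = all(check_balanced_map(m, n) for m in range(1, 8) for n in range(1, 8))
print(all_ok)
```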


Let $f_1$ be an R-homomorphism from a right R-module $M_1$ to a right R-module $M_2$, and
$g_1$ an R-homomorphism from a left R-module $N_1$ to a left R-module $N_2$. The map
$f_1 \times g_1$ from $M_1 \times N_1$ to $M_2 \otimes_R N_2$ defined by $(f_1 \times g_1)(x, y) = f_1(x) \otimes g_1(y)$ is a
balanced map (verify). Hence it induces a unique homomorphism $f_1 \otimes g_1$, called the
tensor product of $f_1$ and $g_1$, from $M_1 \otimes_R N_1$ to $M_2 \otimes_R N_2$ such that $(f_1 \otimes g_1)(x \otimes y) =
f_1(x) \otimes g_1(y)$. Since $\{u \otimes v \mid u \in M_2, v \in N_2\}$ is a set of generators of $M_2 \otimes_R N_2$,
it follows that the tensor product of any two surjective homomorphisms is a surjective
homomorphism. Suppose further that $f_2$ is a homomorphism from $M_2$ to $M_3$, and $g_2$
that from $N_2$ to $N_3$. Then

$$(f_2\circ f_1) \otimes (g_2\circ g_1) = (f_2 \otimes g_2)\circ(f_1 \otimes g_1).$$

If N is a left R-module, then the homomorphism $f \otimes I_N$ from $M_1 \otimes_R N$ to
$M_2 \otimes_R N$ is denoted by $f_*$. It is clear that $(g\circ f)_* = g_*\circ f_*$, and $0_* = 0$. Also
$I_M \otimes I_N = I_{M \otimes_R N}$.

Tensoring is a right exact functor in the sense of the following theorem. 


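Theorem 7.3.14 Let

$$M_1 \xrightarrow{\ \alpha\ } M_2 \xrightarrow{\ \beta\ } M_3 \longrightarrow 0$$

be an exact sequence of right R-modules, and N be a left R-module. Then the
sequence

$$M_1 \otimes_R N \xrightarrow{\ \alpha_*\ } M_2 \otimes_R N \xrightarrow{\ \beta_*\ } M_3 \otimes_R N \longrightarrow 0$$

is also exact.


Proof Since $\beta$ is surjective, and the tensor product of surjective homomorphisms is
surjective, $\beta_*$ is surjective. Thus, we need to show that $\ker\beta_* = \mathrm{image}\,\alpha_*$. Again,
since $\beta\circ\alpha = 0$, we have $0 = (\beta\circ\alpha)_* = \beta_*\circ\alpha_*$. Hence $\mathrm{image}\,\alpha_* \subseteq \ker\beta_*$.
Put $\mathrm{image}\,\alpha_* = L$. By the fundamental theorem of homomorphism, we have a
unique homomorphism $\phi$ from $(M_2 \otimes_R N)/L$ to $M_3 \otimes_R N$ defined by $\phi((m_2 \otimes n) +
L) = \beta(m_2) \otimes n$. It is sufficient to show that $\phi$ is an isomorphism. We construct
its inverse. If $\beta(m_2) = \beta(m_2')$, then $m_2 - m_2'$ belongs to $\ker\beta = \mathrm{image}\,\alpha$. Hence
$(m_2 \otimes n) - (m_2' \otimes n) = (m_2 - m_2') \otimes n$ is in L. This ensures that we have a map
$(m_3, n) \mapsto m_2 \otimes n + L$ from $M_3 \times N$ to $(M_2 \otimes_R N)/L$, where $\beta(m_2) = m_3$. This is
a balanced map (verify). Hence we have a unique homomorphism $\psi$ from $M_3 \otimes_R N$
to $(M_2 \otimes_R N)/L$ such that $\psi(m_3 \otimes n) = m_2 \otimes n + L$, where $\beta(m_2) = m_3$. Now
$(\phi\circ\psi)(m_3 \otimes n) = \phi(m_2 \otimes n + L) = \beta(m_2) \otimes n = m_3 \otimes n$. This shows that
$\phi\circ\psi$ is the identity map. Similarly, $\psi\circ\phi$ is also the identity map. This proves that $\phi$
is an isomorphism, and so $L = \ker\beta_*$. ♯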


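Remark 7.3.15 Consider the homomorphism f from $\mathbb{Z}$ to $\mathbb{Z}$ defined by $f(a) = 5a$.
Then f is injective, but $f_*$ from $\mathbb{Z} \otimes_{\mathbb{Z}} \mathbb{Z}_5$ to itself is the zero map, for $f_*(m \otimes \bar{a}) =
f(m) \otimes \bar{a} = 5m \otimes \bar{a} = m \otimes 5\bar{a} = 0$. Since $\mathbb{Z} \otimes_{\mathbb{Z}} \mathbb{Z}_5 \approx \mathbb{Z}_5$ is nontrivial, $f_*$ is not
injective. This shows that tensoring is not left exact.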




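Example 7.3.16 Let A be an abelian group. Then $\mathbb{Z}_m \otimes_{\mathbb{Z}} A$ is isomorphic to $A/mA$:
Consider the exact sequence

$$0 \longrightarrow \mathbb{Z} \xrightarrow{\ \alpha\ } \mathbb{Z} \xrightarrow{\ \nu\ } \mathbb{Z}_m \longrightarrow 0$$

where $\alpha$ is the multiplication by m. Taking the tensor product with A, and observing
the fact that tensoring is right exact, we see that $\mathbb{Z}_m \otimes_{\mathbb{Z}} A$ is isomorphic to
$(\mathbb{Z} \otimes_{\mathbb{Z}} A)/\ker\nu_*$. Again $\ker\nu_* = \mathrm{image}\,\alpha_*$. The isomorphism f from $\mathbb{Z} \otimes_{\mathbb{Z}} A$ to A given by
$f(n \otimes a) = na$ takes $\mathrm{image}\,\alpha_*$ to $mA$. The assertion follows from the fundamental
theorem of homomorphism.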


Let V be a vector space over a field F. Let $\otimes^s V$ denote the s times tensor product
of V with itself, and $\otimes^r V^*$ denote the r times tensor product of the dual space $V^*$
with itself. Let $V^s_r$ denote the tensor product $(\otimes^s V) \otimes (\otimes^r V^*)$. The members of
$V^s_r$ are called tensors of the type (r, s). Tensor product induces a multiplication
in $T(V) = \oplus\sum_{r,s} V^s_r$, with respect to which it is an associative algebra called the
tensor algebra of V. Riemann's metric tensor is an example of a tensor of order
(2, 0).

Let V be a vector space over F. Let W denote the subspace of $\otimes^r V$ generated by
$\{x_1 \otimes x_2 \otimes \cdots \otimes x_r \mid x_i = x_j \text{ for some } i \neq j\}$, and let $\bigwedge^r V$ denote the quotient space
$(\otimes^r V)/W$. Let us denote the coset $x_1 \otimes x_2 \otimes \cdots \otimes x_r + W$ by $x_1 \wedge x_2 \wedge \cdots \wedge x_r$.
The map f from $V^r$ to $\bigwedge^r V$ defined by

$$f(x_1, x_2, \ldots, x_r) = x_1 \wedge x_2 \wedge \cdots \wedge x_r$$

is r-alternating, and the pair $(\bigwedge^r V, f)$ is universal in the sense that if g is any r-alternating
map from $V^r$ to a space U, then there is a unique linear transformation $\eta$
from $\bigwedge^r V$ to U such that

$$\eta(x_1 \wedge x_2 \wedge \cdots \wedge x_r) = g(x_1, x_2, \ldots, x_r).$$

The pair $(\bigwedge^r V, f)$ is called the rth exterior power of V.

If T is a linear transformation from V to W, then the map $\times^r T$ from $V^r$ to
$\bigwedge^r W$ defined by $(\times^r T)(x_1, x_2, \ldots, x_r) = T(x_1) \wedge T(x_2) \wedge \cdots \wedge T(x_r)$ is an r-alternating
map, and so it induces a unique homomorphism $\bigwedge^r T$ from $\bigwedge^r V$ to
$\bigwedge^r W$ which takes $x_1 \wedge x_2 \wedge \cdots \wedge x_r$ to $T(x_1) \wedge T(x_2) \wedge \cdots \wedge T(x_r)$. It is easy
to verify that $\bigwedge^r(T'\circ T) = (\bigwedge^r T')\circ(\bigwedge^r T)$, and $\bigwedge^r I_V = I_{\bigwedge^r V}$. In particular, if
T is an isomorphism, then $\bigwedge^r T$ is an isomorphism for all r.

If V is a vector space of dimension n, then any m-alternating map on V for $m > n$
is the zero map (for an r-alternating map takes any linearly dependent r-tuple to 0). Thus,
we have the following proposition.




Proposition 7.3.17 Let V be a vector space of dimension n. Then $\bigwedge^m V = \{0\}$ for
all $m > n$. ♯


Theorem 7.3.18 Let V be a vector space of dimension n. Then $\dim \bigwedge^n V = 1$.


Proof Let $\{e_1, e_2, \ldots, e_n\}$ be an ordered basis of V. If f is an n-alternating
map, then for any ordered n-tuple $(x_1, x_2, \ldots, x_n)$, $f(x_1, x_2, \ldots, x_n) = \det A\cdot
f(e_1, e_2, \ldots, e_n)$, where $x_i = \sum_{j=1}^{n} a_{ji}e_j$, and $A = [a_{ij}]$. Thus, f is determined
uniquely by its value $f(e_1, e_2, \ldots, e_n)$. This shows that $\dim \bigwedge^n V$ is at most 1. Also
the map f defined by $f(x_1, x_2, \ldots, x_n) = \det A$ defines a nonzero n-alternating
map (indeed, $f(e_1, e_2, \ldots, e_n) = 1$). This shows that the dimension of $\bigwedge^n V$ is 1. ♯


Let $V$ be a vector space of dimension $n$, and $T$ a linear transformation on $V$. Then $\bigwedge^n T$ is a linear transformation on $\bigwedge^n V$. Since $\bigwedge^n V$ is of dimension 1, the linear transformation $\bigwedge^n T$ is multiplication by a scalar. This scalar is called the determinant of $T$. It is easy to observe that this definition of determinant agrees with the definition of determinant in Chap. 5.
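The scalar by which $\bigwedge^n T$ acts can be computed directly by expanding $T(e_1) \wedge \cdots \wedge T(e_n)$ and collecting the coefficient of $e_1 \wedge \cdots \wedge e_n$. The following is a minimal Python sketch of that expansion; the function names are illustrative only, and the matrix is assumed to be given by its columns $T(e_j)$ in a fixed basis.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def top_exterior_scalar(T):
    """Coefficient of e_1 ^ ... ^ e_n in T(e_1) ^ ... ^ T(e_n).

    T[row][col] is the e_row coefficient of T(e_col). Terms with a repeated
    basis vector vanish, so only permutations contribute.
    """
    n = len(T)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for col in range(n):
            term *= T[perm[col]][col]
        total += term
    return total

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
```

Since $\bigwedge^n (S \circ T) = (\bigwedge^n S) \circ (\bigwedge^n T)$, the scalar is multiplicative; for instance `top_exterior_scalar(matmul(S, T))` equals the product of the two individual scalars for any square integer matrices `S` and `T` of the same size.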


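Theorem 7.3.19 Let $V$ be a vector space of dimension $n$, and $r \le n$. Then $\dim \bigwedge^r V = {}^nC_r$.

Proof For $r = n$, the result is the content of the above theorem. Let $\{e_1, e_2, \ldots, e_n\}$ be a basis of $V$. Then, as observed in the above theorem, $e_1 \wedge e_2 \wedge \cdots \wedge e_n$ is nonzero. Consider the subset $S = \{e_{i_1} \wedge e_{i_2} \wedge \cdots \wedge e_{i_r} \mid i_1 < i_2 < \cdots < i_r\}$ of $\bigwedge^r V$. Clearly, every member of $\bigwedge^r V$ is a linear combination of members of $S$, and so $S$ is a set of generators. We show that $S$ is linearly independent. Suppose that

$$\sum_{i_1 < i_2 < \cdots < i_r} a_{i_1 i_2 \cdots i_r} \, (e_{i_1} \wedge e_{i_2} \wedge \cdots \wedge e_{i_r}) = 0.$$

Fix $j_1 < j_2 < \cdots < j_r$. Suppose that

$$\{1, 2, \ldots, n\} - \{j_1, j_2, \ldots, j_r\} = \{j_{r+1}, j_{r+2}, \ldots, j_n\}.$$

Taking the exterior product with $e_{j_{r+1}} \wedge e_{j_{r+2}} \wedge \cdots \wedge e_{j_n}$, every term other than the one indexed by $j_1 < j_2 < \cdots < j_r$ contains a repeated basis vector and so vanishes. We obtain that

$$a_{j_1 j_2 \cdots j_r} \, e_{j_1} \wedge e_{j_2} \wedge \cdots \wedge e_{j_r} \wedge e_{j_{r+1}} \wedge \cdots \wedge e_{j_n} = 0.$$

Since $j_k \neq j_l$ for all $k \neq l$, it follows that

$$e_{j_1} \wedge e_{j_2} \wedge \cdots \wedge e_{j_n} = \pm (e_1 \wedge e_2 \wedge \cdots \wedge e_n) \neq 0.$$

Hence $a_{j_1 j_2 \cdots j_r} = 0$. This shows that $S$ is linearly independent, and so it is a basis. Clearly, the number of elements in $S$ is ${}^nC_r$. ♯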


The exterior product gives us an external multiplication from $\bigwedge^r V \times \bigwedge^s V$ to $\bigwedge^{r+s} V$, and this can be extended linearly to a multiplication on $E(V) = \bigoplus_{r=0}^{n} (\bigwedge^r V)$ with respect to which it is an associative algebra, and it is called the exterior algebra of $V$. Also,

$$\dim(E(V)) = \sum_{r=0}^{n} \dim \bigwedge^r V = \sum_{r=0}^{n} {}^nC_r = 2^n.$$
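The basis $S$ used in the proof of Theorem 7.3.19 is indexed by strictly increasing $r$-tuples of indices, so it can be enumerated mechanically. The short Python sketch below (the function name is illustrative) lists these index tuples and confirms the two dimension counts for a small $n$.

```python
from itertools import combinations
from math import comb

def exterior_basis(n, r):
    """Index tuples (i_1 < ... < i_r) labelling the basis e_{i_1} ^ ... ^ e_{i_r}."""
    return list(combinations(range(1, n + 1), r))

n = 4
dims = [len(exterior_basis(n, r)) for r in range(n + 1)]
# dims[r] is the dimension of the r-th exterior power; their sum is dim E(V).
total_dim = sum(dims)
```

For $n = 4$ the list `exterior_basis(4, 2)` has six entries, starting with `(1, 2)` and ending with `(3, 4)`, and `total_dim` is $2^4 = 16$.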


Exercises 
7.3.1 Show that Z; ®z Z; is the trivial group. 


7.3.2 Show that $A \otimes_{\mathbb{Z}} \mathbb{Z}_m$ is trivial whenever $A$ is divisible. Deduce that $\mathbb{Q} \otimes_{\mathbb{Z}} \mathbb{Z}_m$ is trivial.

7.3.3 Let $A$ be an abelian group of exponent $m$. Show that $A \otimes_{\mathbb{Z}} \mathbb{Z}_m$ is isomorphic to $A$.

7.3.4 Show that tensoring takes a split exact sequence to a split exact sequence.

7.3.5 Find $\mathbb{Q} \otimes_{\mathbb{Z}} \mathbb{Q}$.

7.3.6 Let $R$ be a commutative ring. Show that $\mathrm{Hom}_R(A \otimes_R B, C)$ is isomorphic to $\mathrm{Hom}_R(A, \mathrm{Hom}_R(B, C))$.

7.3.7 Show that the definition of determinant in this section agrees with the definition of determinant given in the previous chapter, and establish all properties of determinant using this definition.

7.3.8 Let $V$ be a vector space of dimension at least 2. Show that a linear transformation $T$ on $V$ is an isomorphism if and only if $\bigwedge^r T$ is an isomorphism for all $r \ge 2$.


7.4 Lower K-theory 


In this section, we shall introduce and discuss the functors $K_0$ and $K_1$ from the category of rings to the category of abelian groups. Let $R$ be a ring with identity. Let $\mathcal{P}(R)$ denote the set of isomorphism classes of finitely generated projective left $R$-modules (note that $\mathcal{P}(R)$ is, indeed, a set). The isomorphism class of the projective module determined by $P$ will be denoted by $[P]$. Let $K_0(R)$ denote the abelian group generated by $\mathcal{P}(R)$ subject to the relation

$$[P] + [Q] = [P \oplus Q].$$

More precisely, $K_0(R) = F/N$, where $F$ is the free abelian group with basis $\mathcal{P}(R)$, and $N$ is the subgroup of $F$ generated by the set of elements of the type $[P] + [Q] - [P \oplus Q]$. The group $K_0(R)$ is also called the Grothendieck group of the ring $R$. It is also called the Grothendieck group of the category $P_R$ of finitely generated projective left modules over $R$. The coset $[P] + N$ is denoted by $\langle P \rangle$. Thus, the elements of the type $\langle P \rangle$ generate $K_0(R)$. Clearly, any element of $K_0(R)$ is expressible as $\langle P \rangle - \langle Q \rangle$.


Definition 7.4.1 Two finitely generated projective $R$-modules $P$ and $Q$ are said to be stably isomorphic if $P \oplus R^n$ is isomorphic to $Q \oplus R^n$ for some $n$. A projective $R$-module $P$ is said to be stably free if it is stably isomorphic to a free $R$-module.

Remark 7.4.2 Clearly, isomorphic projective modules are stably isomorphic. However, two stably isomorphic projective modules need not be isomorphic. For example, let $V$ be an infinite-dimensional vector space over a field $F$, and $R$ the ring of endomorphisms of $V$. Then the module $R$ over $R$ is isomorphic to the module $R \oplus R$ (check it). As such, the trivial module is stably isomorphic to the module $R$, but $R$ is not isomorphic to the trivial module.

Proposition 7.4.3 $\langle P \rangle = \langle Q \rangle$ if and only if $P$ and $Q$ are stably isomorphic. In turn, $\langle P \rangle - \langle Q \rangle = \langle P' \rangle - \langle Q' \rangle$ if and only if $P \oplus Q'$ is stably isomorphic to $P' \oplus Q$.


Proof Suppose that $P$ is stably isomorphic to $Q$. Then $[P \oplus R^n] = [Q \oplus R^n]$ for some $n$. This means that $\langle P \rangle + \langle R^n \rangle = \langle Q \rangle + \langle R^n \rangle$. This shows that $\langle P \rangle = \langle Q \rangle$. Conversely, suppose that $\langle P \rangle = \langle Q \rangle$. Then $[P] - [Q] \in N$. Since $N$ is the subgroup of $F$ generated by the elements of the type $[P] + [Q] - [P \oplus Q]$, there exist elements $[P_i], [Q_i]$, $i = 1, 2, \ldots, n$, and $[L_j], [M_j]$, $j = 1, 2, \ldots, m$, in $F$ such that

$$[P] - [Q] = \sum_{i=1}^{n} \bigl([P_i] + [Q_i] - [P_i \oplus Q_i]\bigr) - \sum_{j=1}^{m} \bigl([L_j] + [M_j] - [L_j \oplus M_j]\bigr)$$

in $F$. Equivalently,

$$[P] + \sum_{i=1}^{n} [P_i \oplus Q_i] + \sum_{j=1}^{m} \bigl([L_j] + [M_j]\bigr) = [Q] + \sum_{j=1}^{m} [L_j \oplus M_j] + \sum_{i=1}^{n} \bigl([P_i] + [Q_i]\bigr).$$

Since $F$ is free abelian on $\mathcal{P}(R)$, there is a bijective correspondence between the set of terms in the sum on the LHS and the set of terms in the sum on the RHS such that the corresponding terms represent the same elements in $\mathcal{P}(R)$. This ensures the existence of a finitely generated projective module $U$ such that $P \oplus U$ is isomorphic to $Q \oplus U$. Since $U$ is projective, there is a module $V$ such that $U \oplus V$ is isomorphic to $R^n$ for some $n$. But then $P \oplus R^n$ is isomorphic to $Q \oplus R^n$. This shows that $P$ is stably isomorphic to $Q$. The rest follows immediately. ♯


Let $f$ be a homomorphism from a ring $R_1$ to a ring $R_2$. A ring homomorphism is always assumed to preserve the identities of the rings. $R_2$ can be treated as a right $R_1$-module by defining $r \cdot a = r f(a)$, $r \in R_2$, $a \in R_1$. In fact, $R_2$ is then a bi-$(R_2, R_1)$-module. If $M$ is a left $R_1$-module, then $R_2 \otimes_{R_1} M$ is a left $R_2$-module. We denote this $R_2$-module by $f_*(M)$. Since tensor product distributes over direct sum, the following assertions can be easily verified.

(i) $f_*(P \oplus Q) \approx f_*(P) \oplus f_*(Q)$,
(ii) $f_*$ takes finitely generated modules to finitely generated modules,
(iii) $f_*$ takes free modules to free modules,
(iv) $f_*$ takes projective modules to projective modules,
(v) $f_*$ defines a map from $\mathcal{P}(R_1)$ to $\mathcal{P}(R_2)$, and it respects the relation $[P] + [Q] = [P \oplus Q]$.

In turn, $f$ induces a homomorphism $K_0(f)$ from $K_0(R_1)$ to $K_0(R_2)$ given by $K_0(f)(\langle P \rangle - \langle Q \rangle) = \langle f_*(P) \rangle - \langle f_*(Q) \rangle$. Further, (i) $K_0(I_R) = I_{K_0(R)}$, and (ii) $K_0(g \circ f) = K_0(g) \circ K_0(f)$. In the language of category theory, it says that $K_0$ is a functor from the category of rings to the category of abelian groups.

If $R$ is a commutative ring, and $P$ and $Q$ are finitely generated projective modules, then $P \otimes Q$ is again a finitely generated projective module. Since tensor product distributes over direct sum, we have a product $\cdot$ on $K_0(R)$ given by

$$(\langle P \rangle - \langle Q \rangle) \cdot (\langle P' \rangle - \langle Q' \rangle) = \langle P \otimes P' \rangle + \langle Q \otimes Q' \rangle - \langle P \otimes Q' \rangle - \langle Q \otimes P' \rangle.$$

It follows that $K_0(R)$ is a commutative ring. Thus, $K_0$ also defines a functor from the category of commutative rings to the category of commutative rings.


Proposition 7.4.4 Let $R$ be a ring such that the following hold:

(i) $R^n$ is isomorphic to $R^m$ if and only if $n = m$.
(ii) Every finitely generated projective module is free.

Then $K_0(R)$ is isomorphic to the group of integers.

Proof Under the hypothesis of the proposition, $\mathcal{P}(R) = \{[R^n]\}$, and $R^n$ is stably isomorphic to $R^m$ if and only if $n = m$. Thus, $[R^n]$ may be identified with $\langle R^n \rangle$. The map $\eta$ from the set $\{\langle R^n \rangle \mid n \in \mathbb{N}\}$ to $\mathbb{N}$ defined by $\eta(\langle R^n \rangle) = n$ is a bijective map such that $\eta(\langle R^n \oplus R^m \rangle) = \eta(\langle R^n \rangle) + \eta(\langle R^m \rangle)$. Suppose that $\langle R^n \rangle - \langle R^m \rangle = \langle R^r \rangle - \langle R^s \rangle$. Then $\langle R^{n+s} \rangle = \langle R^{m+r} \rangle$. This means that $n + s = m + r$. Thus, $\eta$ can be extended to a map $\bar{\eta}$ from $K_0(R)$ to $\mathbb{Z}$ given by $\bar{\eta}(\langle R^n \rangle - \langle R^m \rangle) = n - m$. Clearly, $\bar{\eta}$ is an isomorphism. ♯

Corollary 7.4.5 (i) If $D$ is a division ring, then $K_0(D)$ is the group of integers.
(ii) If $R$ is a principal ideal domain, then $K_0(R)$ is the group of integers.
(iii) If $R$ is a local ring, then $K_0(R)$ is the group of integers.

Proof In each of the cases, the hypothesis of the above proposition is satisfied (for a local ring, see Exercises 5, 6, and 7 of Sect. 7.2). ♯


Now, we introduce the functor $K_1$ from the category of rings to the category of abelian groups. Let $R$ be a ring with identity. Let us denote by $GL(n, R)$ the group of invertible $n \times n$ matrices with entries in $R$. There is a natural embedding of $GL(n, R)$ into $GL(n + 1, R)$ given by

$$A \mapsto \begin{bmatrix} A & 0 \\ 0 & 1 \end{bmatrix}.$$

Under this embedding, $GL(n, R)$ can be treated as a subgroup of $GL(n + 1, R)$. We get a chain of groups

$$GL(1, R) \subseteq GL(2, R) \subseteq \cdots \subseteq GL(n, R) \subseteq GL(n + 1, R) \subseteq \cdots$$

The union of this chain is a group denoted by $GL(R)$, and it is called the general linear group over the ring $R$. The elementary $n \times n$ matrices $E_{ij}^{\lambda}$ (also called transvections) are members of $GL(n, R)$. The Steinberg relations still hold among these matrices. Let $E(n, R)$ denote the subgroup of $GL(n, R)$ generated by these elementary matrices. We have a chain

$$E(1, R) \subseteq E(2, R) \subseteq \cdots \subseteq E(n, R) \subseteq E(n + 1, R) \subseteq \cdots$$

of subgroups of $GL(R)$. The union $E(R)$ of this chain is a subgroup of $GL(R)$. Recall that for a field $F$, every matrix of determinant 1 can be reduced to the identity matrix using the elementary operations corresponding to transvections. In other words, the special linear group $SL(F) \subseteq E(F)$. Already, the elements of $E(F)$ are of determinant 1. Thus, $SL(F) = E(F)$.


Proposition 7.4.6 $E(R)$ is a perfect group in the sense that $[E(R), E(R)] = E(R)$.

Proof One of the Steinberg relations is $[E_{ij}^{\lambda}, E_{jk}^{\mu}] = E_{ik}^{\lambda\mu}$ for distinct $i, j, k$. Taking $\mu = 1$, we observe that every transvection is a member of $[E(R), E(R)]$. Hence, $[E(R), E(R)] = E(R)$. ♯
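The Steinberg relation used in this proof is easy to check on concrete matrices. The sketch below is a small illustration in Python over the integers, assuming the commutator convention $[A, B] = ABA^{-1}B^{-1}$ and using the fact that the inverse of $E_{ij}^{\lambda}$ is $E_{ij}^{-\lambda}$; the helper names are hypothetical.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def transvection(n, i, j, lam):
    """The elementary matrix E_ij^lam: identity plus lam in position (i, j), i != j."""
    assert i != j
    m = identity(n)
    m[i][j] = lam
    return m

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)] for r in range(n)]

def commutator_of_transvections(n, i, j, lam, k, mu):
    """[E_ij^lam, E_jk^mu] = E_ij^lam E_jk^mu E_ij^(-lam) E_jk^(-mu)."""
    a, b = transvection(n, i, j, lam), transvection(n, j, k, mu)
    a_inv, b_inv = transvection(n, i, j, -lam), transvection(n, j, k, -mu)
    return matmul(matmul(a, b), matmul(a_inv, b_inv))
```

With $n = 3$ and indices $(i, j, k) = (0, 1, 2)$, the commutator of $E_{01}^{2}$ and $E_{12}^{5}$ comes out as $E_{02}^{10}$, in agreement with the relation.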


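Proposition 7.4.7 Matrices of the type

$$\begin{bmatrix} I & A \\ 0 & I \end{bmatrix}, \quad \begin{bmatrix} I & 0 \\ A & I \end{bmatrix}, \quad \text{and} \quad \begin{bmatrix} 0 & -I \\ I & 0 \end{bmatrix}$$

are members of $E(R)$.

Proof The result follows if we observe that the matrices described in the proposition can be reduced to the identity matrix by applying the elementary operations associated to the matrices $E_{ij}^{\lambda}$. ♯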


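Corollary 7.4.8 Let $A \in GL(R)$. Then the matrix

$$\begin{bmatrix} A & 0 \\ 0 & A^{-1} \end{bmatrix}$$

is a member of $E(R)$.

Proof Follows from the above proposition and the identity

$$\begin{bmatrix} A & 0 \\ 0 & A^{-1} \end{bmatrix} = \begin{bmatrix} I & A \\ 0 & I \end{bmatrix} \begin{bmatrix} I & 0 \\ -A^{-1} & I \end{bmatrix} \begin{bmatrix} I & A \\ 0 & I \end{bmatrix} \begin{bmatrix} 0 & -I \\ I & 0 \end{bmatrix}.$$

♯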
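The block identity in this proof can be confirmed on a concrete invertible matrix. The following Python sketch multiplies the four block factors for one integer matrix $A$ of determinant 1 (chosen only for illustration) and compares the result with the block diagonal matrix; the helper names are assumptions of the sketch.

```python
from functools import reduce

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def block(top_left, top_right, bottom_left, bottom_right):
    """Assemble a 2n x 2n matrix from four n x n blocks."""
    top = [r1 + r2 for r1, r2 in zip(top_left, top_right)]
    bottom = [r1 + r2 for r1, r2 in zip(bottom_left, bottom_right)]
    return top + bottom

def negate(m):
    return [[-x for x in row] for row in m]

A = [[2, 1], [1, 1]]          # illustrative matrix with determinant 1
A_inv = [[1, -1], [-1, 2]]    # its inverse, also with integer entries
I = [[1, 0], [0, 1]]
Z = [[0, 0], [0, 0]]

factors = [
    block(I, A, Z, I),
    block(I, Z, negate(A_inv), I),
    block(I, A, Z, I),
    block(Z, negate(I), I, Z),
]
product = reduce(matmul, factors)
```

Here `product` coincides with `block(A, Z, Z, A_inv)`, as the identity predicts.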


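Lemma 7.4.9 (Whitehead Lemma) $[GL(R), GL(R)] = E(R)$.

Proof Already $E(R) = [E(R), E(R)] \subseteq [GL(R), GL(R)]$. Thus, it is sufficient to observe that any commutator $ABA^{-1}B^{-1}$ in $GL(n, R)$, treated as an element of $GL(2n, R)$, can be expressed as

$$\begin{bmatrix} ABA^{-1}B^{-1} & 0 \\ 0 & I \end{bmatrix} = \begin{bmatrix} A & 0 \\ 0 & A^{-1} \end{bmatrix} \begin{bmatrix} B & 0 \\ 0 & B^{-1} \end{bmatrix} \begin{bmatrix} (BA)^{-1} & 0 \\ 0 & BA \end{bmatrix},$$

and that each factor on the right is a member of $E(R)$ by Corollary 7.4.8. ♯

Definition 7.4.10 The abelian group $GL(R)/E(R)$ is called the Whitehead group of the ring $R$, and it is denoted by $K_1(R)$.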


Thus, the Whitehead group $K_1(R)$ can be viewed as the group of equivalence classes of matrices in $GL(R)$, where two matrices $A$ and $B$ in $GL(R)$ are said to be equivalent if one can be obtained from the other by using elementary operations associated to transvections. For example, over fields, two matrices are equivalent if and only if they have the same determinant. Note that a nonsingular matrix $A$ with entries in a field $F$ can be reduced to the matrix $\mathrm{diag}(1, 1, \ldots, 1, \det A)$ by using the elementary operations associated to transvections.
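The reduction to $\mathrm{diag}(1, 1, \ldots, 1, \det A)$ can be carried out explicitly. The following is a minimal Python sketch over $\mathbb{Q}$, assuming the input is invertible; it uses only row operations of the form "add a multiple of one row to a different row", which is left multiplication by a transvection. The function names and the particular order of operations are choices made for this sketch, not taken from the text.

```python
from fractions import Fraction

def add_row_multiple(m, target, source, factor):
    """Left-multiply by a transvection: row[target] += factor * row[source]."""
    assert target != source
    m[target] = [t + factor * s for t, s in zip(m[target], m[source])]

def reduce_by_transvections(matrix):
    """Reduce an invertible matrix over Q to diag(1, ..., 1, d) by transvections."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    for k in range(n - 1):
        if m[k][k] != 1:
            pivot = next((i for i in range(k + 1, n) if m[i][k] != 0), None)
            if pivot is None:
                # Column k is zero below the diagonal, so m[k][k] != 0 by
                # invertibility; copy it into row k + 1 to create a pivot.
                add_row_multiple(m, k + 1, k, Fraction(1))
                pivot = k + 1
            # Use the pivot row to turn the diagonal entry into 1.
            add_row_multiple(m, k, pivot, (1 - m[k][k]) / m[pivot][k])
        # Clear the rest of column k, above and below the diagonal.
        for i in range(n):
            if i != k and m[i][k] != 0:
                add_row_multiple(m, i, k, -m[i][k])
    last = n - 1
    # Clear the last column above the final diagonal entry.
    for i in range(last):
        if m[i][last] != 0:
            add_row_multiple(m, i, last, -m[i][last] / m[last][last])
    return m
```

Since transvections have determinant 1, the surviving diagonal entry is $\det A$. For example, the matrix with rows $(0, 2)$ and $(3, 4)$ reduces to $\mathrm{diag}(1, -6)$.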

If $f$ is a homomorphism from a ring $R_1$ to a ring $R_2$, then it induces a map from $M_n(R_1)$ to $M_n(R_2)$ which takes $A = [a_{ij}]$ to $f(A) = [b_{ij}]$, where $b_{ij} = f(a_{ij})$. In fact, it maps $GL(R_1)$ to $GL(R_2)$, and $E(R_1)$ to $E(R_2)$. In turn, it induces a homomorphism $K_1(f)$ from $K_1(R_1)$ to $K_1(R_2)$. It can be easily observed that (i) $K_1(g \circ f) = K_1(g) \circ K_1(f)$, and (ii) $K_1(I_R) = I_{K_1(R)}$. In the language of category theory, $K_1$ defines another functor from the category of rings to the category of abelian groups.

If $R$ is a commutative ring, then the determinant of a square matrix with entries in $R$ makes sense, and every element of $E(R)$ is of determinant 1. Thus, $E(R) \subseteq SL(R)$. In general, $E(R) \neq SL(R)$. We denote the group $SL(R)/E(R)$ by $SK_1(R)$. It follows that

$$K_1(R) = SK_1(R) \oplus U(R),$$

where $U(R)$ is the group of units of $R$. For most commutative rings $R$, $SL(R) = E(R)$, and in such cases $K_1(R) = U(R)$. For example, if $R$ is a field, or a local ring, or a Euclidean domain, or the ring of integers in a number field, the matrices of determinant 1 can be reduced to the identity matrix by using elementary operations associated to the transvections $E_{ij}^{\lambda}$. Thus, in these cases $SL(R) = E(R)$, and so $K_1(R) = U(R)$. However, there are commutative rings in which a matrix of determinant 1 may not be reducible to the identity matrix by using elementary operations associated to transvections. For example, consider the ring $R = \mathbb{R}[x, y]/I$, where $I$ is the ideal generated by $x^2 + y^2 - 1$. Then the matrix

$$A = \begin{bmatrix} \bar{x} & \bar{y} \\ -\bar{y} & \bar{x} \end{bmatrix},$$

where $\bar{x} = x + I$ and $\bar{y} = y + I$, is a matrix of determinant 1. Using topological arguments (see "Algebraic K-Theory" by Milnor, p. 58), it can be shown that no nontrivial power of $A$ can be in $E(R)$. Thus, $SK_1(R)$ contains an element of infinite order.


Exercises 
7.4.1 Compute $K_0(\mathbb{Z}_6)$, and also $K_1(\mathbb{Z}_6)$.


7.4.2 Find $K_0(R)$, where $R$ is the ring of endomorphisms of an infinite-dimensional vector space $V$.

7.4.3 Determine $K_0(M_2(\mathbb{C}))$.

7.4.4 Show that $K_0(R_1 \times R_2) \approx K_0(R_1) \times K_0(R_2)$.

7.4.5 Determine $K_0(\mathbb{Z}[i])$ and $K_1(\mathbb{Z}[i])$.


Chapter 8 
Field Theory, Galois Theory 


This chapter is devoted to the theory of fields, Galois theory, geometric constructions by ruler and compass, and the theorem of Abel-Ruffini about polynomial equations of degree $n$, $n \ge 5$. We also discuss cubic and biquadratic equations.


8.1 Field Extensions 


Let $K$ be a subfield of a field $L$. Then we say that $L$ is a field extension of $K$. The notation $L/K$ is used to say that $L$ is a field extension of $K$. If $L$ is a field extension of $K$, then $(L, +)$ is a vector space over $K$ (the multiplication by scalars being the field multiplication). If the dimension of $(L, +)$ over $K$ is infinite, then we say that $L$ is an infinite extension of $K$. If the dimension of $(L, +)$ over $K$ is finite, then we say that $L$ is a finite extension of $K$; the dimension of $(L, +)$ over $K$ is then called the degree of the extension, and it is denoted by $[L : K]$.


Proposition 8.1.1 Let $K$ be a finite field. Then the number of elements in $K$ is $p^n$ for some prime $p$, and for some $n \in \mathbb{N}$.

Proof Since $K$ is a finite field, its characteristic is a prime $p$. The map $i \mapsto i \cdot 1$ is an injective homomorphism from the field $\mathbb{Z}_p$ to the field $K$. Thus, $\mathbb{Z}_p$ can be considered as a subfield of $K$, and since $K$ is finite, it is a finite-dimensional vector space over $\mathbb{Z}_p$. Suppose that the dimension of $K$ over $\mathbb{Z}_p$ is $n$. Then $K$, as a vector space over $\mathbb{Z}_p$, is isomorphic to $\mathbb{Z}_p^n$. This shows that $|K| = p^n$. ♯


Proposition 8.1.2 Let $L/K$ and $K/F$ be finite field extensions. Then $L/F$ is also a finite field extension, and $[L : F] = [L : K][K : F]$.

Proof Suppose that $[L : K] = n$ and $[K : F] = m$. Let $\{x_1, x_2, \ldots, x_n\}$ be a basis of the vector space $L$ over $K$, and $\{y_1, y_2, \ldots, y_m\}$ be a basis of $K$ over $F$. We show that $S = \{x_i y_j \mid 1 \le i \le n, 1 \le j \le m\}$ is a basis of $L$ over $F$. Let $x \in L$. Since $\{x_1, x_2, \ldots, x_n\}$ is a basis of $L$ over $K$, $x = a_1 x_1 + a_2 x_2 + \cdots + a_n x_n$ for some $a_1, a_2, \ldots, a_n$ in $K$. Further, since $\{y_1, y_2, \ldots, y_m\}$ is a basis of $K$ over $F$, $a_i = \sum_{j} a_{ji} y_j$ for some $a_{ji} \in F$. Thus, $x = \sum_{j,i} a_{ji} x_i y_j$. This shows that $S$ generates the vector space $L$ over $F$. Now, we show that $S$ is linearly independent over $F$. Suppose that $\sum_{j,i} a_{ji} x_i y_j = 0$. Then $\sum_{i} \bigl(\sum_{j} a_{ji} y_j\bigr) x_i = 0$. Since $\{x_1, x_2, \ldots, x_n\}$ is linearly independent over $K$, $\sum_{j} a_{ji} y_j = 0$ for all $i$. Again, since $\{y_1, y_2, \ldots, y_m\}$ is linearly independent over $F$, $a_{ji} = 0$ for all $j, i$. Thus $S$ is a basis of $L$ over $F$ with $nm$ elements, and so $[L : F] = nm$. ♯

© Springer Nature Singapore Pte Ltd. 2017
R. Lal, Algebra 2, Infosys Science Foundation Series in Mathematical Sciences,
DOI 10.1007/978-981-10-4256-0_8


Corollary 8.1.3 Let $L$ be a finite extension of $K$, and $F$ a subfield of $L$ containing $K$. Then $[L : F]$ divides $[L : K]$, and also $[F : K]$ divides $[L : K]$. ♯

Corollary 8.1.4 Let $L$ be a finite field containing $p^n$ elements. Let $K$ be a subfield of $L$. Then $K$ contains $p^m$ elements, where $m$ divides $n$.

Proof Since $K$ is a subfield of $L$, $\mathrm{char}\, K = \mathrm{char}\, L = p$. Thus $K$ contains $p^m$ elements for some $m$. Suppose that $[L : K] = r$. Then $L$, as a vector space over $K$, is isomorphic to $K^r$, and so $p^n = |L| = (p^m)^r = p^{mr}$. Hence $n = mr$. ♯
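Corollary 8.1.4 restricts the possible sizes of subfields to $p^m$ with $m$ a divisor of $n$. The tiny Python sketch below (the function name is illustrative) lists these candidate orders; the corollary only shows that no other orders can occur, and the sketch makes no claim beyond that.

```python
def possible_subfield_orders(p, n):
    """Orders p^m with m dividing n, the only sizes a subfield of a field
    with p^n elements can have by Corollary 8.1.4."""
    return [p ** m for m in range(1, n + 1) if n % m == 0]
```

For a field with $2^6 = 64$ elements the candidates are 2, 4, 8, and 64; in particular, no subfield can have 16 or 32 elements.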


Let $L$ be a field extension of $K$. Let $S$ be a subset of $L$. The subring of $L$ generated by $K \cup S$ will be denoted by $K[S]$. Clearly, $K[S]$ is the intersection of all subrings of $L$ containing $K \cup S$. The subfield of $L$ generated by $K \cup S$ is the intersection of all subfields containing $K \cup S$, and it is denoted by $K(S)$. Clearly, $K(S)$ is the field of fractions of $K[S]$. If $S$ is finite, and $K(S) = L$, then we say that $L$ is a finitely generated field extension of $K$. In this case, the ring $K[S]$ is called a finitely generated ring extension of $K$.

Let $S = \{\alpha_1, \alpha_2, \ldots, \alpha_n\}$ be a finite subset of $L$. Let $f(X_1, X_2, \ldots, X_n)$ be a polynomial in $K[X_1, X_2, \ldots, X_n]$. Then $f(\alpha_1, \alpha_2, \ldots, \alpha_n)$ denotes the element of $L$ which is obtained by substituting $\alpha_i$ in place of $X_i$ in the polynomial $f(X_1, X_2, \ldots, X_n)$ for each $i$. It is clear that $f(\alpha_1, \alpha_2, \ldots, \alpha_n)$ belongs to each subring which contains $K \cup S$. Consider the map $\eta$ from $K[X_1, X_2, \ldots, X_n]$ to $L$ defined by $\eta(f(X_1, X_2, \ldots, X_n)) = f(\alpha_1, \alpha_2, \ldots, \alpha_n)$. Clearly, the map $\eta$ is a ring homomorphism whose image is the subring of $L$ generated by $K \cup S$. This subring is denoted by $K[\alpha_1, \alpha_2, \ldots, \alpha_n]$. By the fundamental theorem of homomorphism, $K[X_1, X_2, \ldots, X_n]/\ker \eta$ is isomorphic to $K[\alpha_1, \alpha_2, \ldots, \alpha_n]$. If $\ker \eta = \{0\}$, then we say that the set $\{\alpha_1, \alpha_2, \ldots, \alpha_n\}$ is algebraically independent. More explicitly, $\{\alpha_1, \alpha_2, \ldots, \alpha_n\}$ is said to be algebraically independent if there is no nonzero polynomial $f(X_1, X_2, \ldots, X_n)$ such that $f(\alpha_1, \alpha_2, \ldots, \alpha_n) = 0$. It is said to be algebraically dependent otherwise. It follows that $K[\alpha_1, \alpha_2, \ldots, \alpha_n]$ is the subfield $K(\alpha_1, \alpha_2, \ldots, \alpha_n)$ of $L$ generated by $K \cup \{\alpha_1, \alpha_2, \ldots, \alpha_n\}$ if and only if $\ker \eta$ is a maximal ideal. We shall see that this is, indeed, the case if and only if each $\alpha_i$ is a root of a nonzero polynomial in $K[X]$. Such elements are called algebraic elements over $K$. We first consider the case when $n = 1$.

Let $L$ be a field extension of $K$ and $\alpha \in L$. Consider the map $\eta$ from $K[X]$ to $K[\alpha]$ defined by $\eta(f(X)) = f(\alpha)$. Then $\eta$ is a surjective ring homomorphism. Thus, by the fundamental theorem of homomorphism, $K[X]/\ker \eta$ is isomorphic to $K[\alpha]$. There are two cases:

(i) $\ker \eta = \{0\}$.
(ii) $\ker \eta \neq \{0\}$.

In case (i), $\eta$ is an isomorphism from $K[X]$ to $K[\alpha]$, and we say that $\alpha$ is a transcendental element over $K$. More explicitly, $\alpha$ is a transcendental element over $K$ if $\alpha$ is not a root of any nonzero polynomial $f(X) \in K[X]$. For example, $\pi$ and the exponential $e$ are transcendental over $\mathbb{Q}$.

In case (ii), there is a nonzero polynomial $f(X) \in K[X]$ such that $f(\alpha) = 0$, and in this case we say that $\alpha$ is an algebraic element over $K$. For example, a primitive cube root $\omega = e^{2\pi i/3}$ of unity, being a root of the nonzero polynomial $X^2 + X + 1 \in \mathbb{Q}[X]$, is algebraic over $\mathbb{Q}$.

Suppose that $\alpha$ is an algebraic element over $K$. Then $\ker \eta \neq \{0\}$. Since $K[X]$ is a PID, $\ker \eta$ is a nontrivial principal ideal. Suppose that $\ker \eta = \langle p(X) \rangle = K[X]p(X)$. Let $p'(X)$ be another polynomial in $K[X]$ such that $\ker \eta = \langle p'(X) \rangle = K[X]p'(X)$. Then $p(X)$ divides $p'(X)$, and also $p'(X)$ divides $p(X)$. In turn, $p(X) = a p'(X)$ for some nonzero element $a \in K$. Thus, there is a unique monic polynomial $p(X) \in K[X]$ such that $\ker \eta = \langle p(X) \rangle$. The unique monic polynomial $p(X)$, thus obtained, is called the minimum polynomial of $\alpha$ over $K$, and it is denoted by $\min_K(\alpha)(X)$. Clearly, the minimum polynomial $\min_K(\alpha)(X)$ of $\alpha$ over $K$ is the monic polynomial of smallest degree having $\alpha$ as a root. Conversely, suppose that $f(X)$ is a monic polynomial of smallest degree having $\alpha$ as a root. Then $\min_K(\alpha)(X)$ divides $f(X)$. By the division algorithm, there exist polynomials $q(X)$ and $r(X)$ such that $\min_K(\alpha)(X) = q(X)f(X) + r(X)$, where $r(X) = 0$ or else $\deg r(X) < \deg f(X)$. Hence $r(\alpha) = 0$. This implies that $r(X) = 0$, for otherwise we shall arrive at a contradiction to the supposition that $f(X)$ is the smallest degree polynomial having $\alpha$ as a root. Hence $f(X)$ divides $\min_K(\alpha)(X)$. Since both polynomials are monic, it follows that $f(X) = \min_K(\alpha)(X)$.


Proposition 8.1.5 Let $L$ be a field extension of $K$, and $\alpha \in L$ be an algebraic element over $K$. Then the minimum polynomial $\min_K(\alpha)(X)$ of $\alpha$ over $K$ is an irreducible polynomial in $K[X]$, and the ideal $\langle \min_K(\alpha)(X) \rangle = K[X]\min_K(\alpha)(X)$ is a maximal ideal of $K[X]$.

Proof Suppose that $\min_K(\alpha)(X) = f(X)g(X)$, where $f(X)$ and $g(X)$ are nonconstant polynomials in $K[X]$. Then $\deg f(X) < \deg \min_K(\alpha)(X)$, and also $\deg g(X) < \deg \min_K(\alpha)(X)$. Further, $0 = \min_K(\alpha)(\alpha) = f(\alpha)g(\alpha)$. Since $L$ is a field, $f(\alpha) = 0$ or $g(\alpha) = 0$. But then $\min_K(\alpha)(X)$ divides $f(X)$, or it divides $g(X)$. This is a contradiction, since both $f(X)$ and $g(X)$ are nonzero of smaller degree. Hence $\min_K(\alpha)(X)$ is an irreducible polynomial. Since the ideal generated by an irreducible element in a principal ideal domain is a maximal ideal, $\langle \min_K(\alpha)(X) \rangle$ is a maximal ideal. ♯


Proposition 8.1.6 Let $L$ be a field extension of $K$, and $\alpha \in L$. Then $\alpha$ is algebraic if and only if $K[\alpha] = K(\alpha)$. Further, then $[K(\alpha) : K] = \deg \min_K(\alpha)(X)$.

Proof Suppose that $\alpha$ is algebraic. The map $\eta$ from $K[X]$ to $K[\alpha]$ given by $\eta(f(X)) = f(\alpha)$ is a surjective homomorphism. From the fundamental theorem of homomorphism, $K[X]/\ker \eta$ is isomorphic to $K[\alpha]$. By the above proposition, $\ker \eta = \langle \min_K(\alpha)(X) \rangle$ is a maximal ideal. Hence $K[\alpha]$ is a field, and so $K[\alpha] = K(\alpha)$. Conversely, suppose that $K[\alpha] = K(\alpha)$. Then $K[\alpha]$ is a subfield of $L$. We may assume that $\alpha \neq 0$. Hence $\alpha^{-1} \in K[\alpha]$. Suppose that

$$\alpha^{-1} = a_0 + a_1 \alpha + a_2 \alpha^2 + \cdots + a_n \alpha^n,$$

where at least one $a_i$ is nonzero. Clearly,

$$f(X) = -1 + a_0 X + a_1 X^2 + \cdots + a_n X^{n+1}$$

is a nonzero polynomial in $K[X]$ such that $f(\alpha) = 0$. It follows that $\alpha$ is algebraic. Finally, suppose that $\alpha$ is algebraic, and

$$\min_K(\alpha)(X) = a_0 + a_1 X + \cdots + a_{n-1} X^{n-1} + X^n.$$

It is sufficient to show that the set $S = \{1, \alpha, \alpha^2, \ldots, \alpha^{n-1}\}$ is a basis of $K(\alpha)$ over $K$. Since $K(\alpha) = K[\alpha]$, every element of $K(\alpha)$ is of the form $f(\alpha)$, where $f(X)$ is a polynomial in $K[X]$. By the division algorithm, there exist polynomials $q(X)$ and $r(X)$ such that $f(X) = q(X)\min_K(\alpha)(X) + r(X)$, where $r(X) = 0$, or else $\deg r(X) \le n - 1$. But then $f(\alpha) = r(\alpha)$. Thus, $f(\alpha) = r(\alpha)$ is a linear combination of members of $S$. This shows that $S$ generates the vector space $K(\alpha)$ over $K$. Next, suppose that

$$a_0 + a_1 \alpha + a_2 \alpha^2 + \cdots + a_{n-1} \alpha^{n-1} = 0,$$

where $a_i \in K$ for each $i$. Then each $a_i$ is 0, for otherwise we get a nonzero polynomial

$$g(X) = a_0 + a_1 X + a_2 X^2 + \cdots + a_{n-1} X^{n-1}$$

of degree less than the degree of $\min_K(\alpha)(X)$ such that $g(\alpha) = 0$. This shows that $S$ is also linearly independent. ♯
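The proof shows that every element of $K(\alpha)$ is a polynomial in $\alpha$ of degree less than $n$, and that nonzero elements are invertible inside $K[\alpha]$. The following Python sketch illustrates this for $K = \mathbb{Q}$ by computing inverses modulo a given minimum polynomial with the extended Euclidean algorithm. The representation (coefficient lists, constant term first) and all function names are assumptions of the sketch, and the modulus is assumed to be irreducible.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients (highest-degree terms)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)])

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def divmod_poly(p, d):
    """Division algorithm in Q[X]: returns (quotient, remainder)."""
    p = [Fraction(c) for c in trim(p)]
    d = trim(d)
    q = [Fraction(0)] * max(len(p) - len(d) + 1, 0)
    while len(p) >= len(d):
        shift = len(p) - len(d)
        c = p[-1] / d[-1]
        q[shift] = c
        for i, b in enumerate(d):
            p[i + shift] -= c * b
        p = trim(p)
    return trim(q), p

def inverse_mod(f, m):
    """Inverse of f(alpha) in Q[X]/(m), assuming m irreducible and f(alpha) != 0."""
    # Invariant: r0 = s0 * f and r1 = s1 * f modulo m.
    r0, r1 = trim(m), divmod_poly(f, m)[1]
    s0, s1 = [], [Fraction(1)]
    while r1:
        q, r = divmod_poly(r0, r1)
        r0, r1 = r1, r
        s0, s1 = s1, add(s0, [-c for c in mul(q, s1)])
    assert len(r0) == 1, "f is not invertible modulo m"
    return divmod_poly([c / r0[0] for c in s0], m)[1]
```

For instance, with the modulus $X^3 - 2$ (so that $\alpha$ is a cube root of 2), `inverse_mod([1, 1], [-2, 0, 0, 1])` expresses $(1 + \alpha)^{-1}$ as $\tfrac{1}{3}(1 - \alpha + \alpha^2)$, which agrees with $(1 + \alpha)(1 - \alpha + \alpha^2) = 1 + \alpha^3 = 3$.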


Proposition 8.1.7 Let $L$ be a field extension of $K$. An element $\alpha \in L$ is algebraic over $K$ if and only if $K(\alpha)$ is a finite extension of $K$.

Proof Suppose that $\alpha$ is algebraic over $K$. From Proposition 8.1.6, it follows that $K(\alpha)$ is a finite extension of $K$. Conversely, suppose that $K(\alpha)$ is a finite extension of degree $n$ over $K$. Then the dimension of $K(\alpha)$ over $K$ is $n$. Hence the set $\{1, \alpha, \alpha^2, \ldots, \alpha^n\}$ is linearly dependent. Thus, there exist $a_0, a_1, \ldots, a_n$ in $K$, not all 0, such that

$$a_0 + a_1 \alpha + a_2 \alpha^2 + \cdots + a_n \alpha^n = 0.$$

This gives us a nonzero polynomial

$$g(X) = a_0 + a_1 X + a_2 X^2 + \cdots + a_n X^n$$

such that $g(\alpha) = 0$. It follows that $\alpha$ is algebraic over $K$. ♯
Proposition 8.1.8 Let $L$ be a field extension of $K$, and $\alpha_1, \alpha_2, \ldots, \alpha_n$ be elements of $L$ which are algebraic over $K$. Then $K(\alpha_1, \alpha_2, \ldots, \alpha_n) = K[\alpha_1, \alpha_2, \ldots, \alpha_n]$. In particular, $\ker \eta$ is a maximal ideal of $K[X_1, X_2, \ldots, X_n]$.

Proof The proof is by induction on $n$. For $n = 1$, the result is Proposition 8.1.6. Assume that the result is true for $n$. Let $\alpha_1, \alpha_2, \ldots, \alpha_n, \alpha_{n+1}$ be elements of $L$ which are algebraic over $K$. By the induction hypothesis,

$$K[\alpha_1, \alpha_2, \ldots, \alpha_{n+1}] = K[\alpha_1, \alpha_2, \ldots, \alpha_n][\alpha_{n+1}] = K(\alpha_1, \alpha_2, \ldots, \alpha_n)[\alpha_{n+1}].$$

Since $\alpha_{n+1}$ is algebraic over $K$, it is also algebraic over $K(\alpha_1, \alpha_2, \ldots, \alpha_n)$. From Proposition 8.1.6,

$$K(\alpha_1, \alpha_2, \ldots, \alpha_n)[\alpha_{n+1}] = K(\alpha_1, \alpha_2, \ldots, \alpha_{n+1}).$$

♯

Proposition 8.1.9 Let $L$ be a field extension of $K$, and $F$ be a subfield of $L$ containing $K$. Let $\alpha \in L$ be algebraic over $K$. Then $\alpha$ is algebraic over $F$, and $\min_F(\alpha)(X)$ divides $\min_K(\alpha)(X)$ in $F[X]$.

Proof Since $\min_K(\alpha)(X)$ belongs to $K[X]$, it also belongs to $F[X]$. Again, since $\alpha$ is a root of $\min_K(\alpha)(X)$, the result follows. ♯


Definition 8.1.10 A field extension L of K is said to be an algebraic extension of 
K if every element of L is algebraic over K. 


Evidently, we have the following proposition: 


Proposition 8.1.11 Let $L$ be an algebraic extension of $K$ and $F$ be a subfield of $L$ containing $K$. Then $L$ is an algebraic extension of $F$. ♯

Proposition 8.1.12 Let $L$ be a finite extension of $K$. Then $L$ is an algebraic extension of $K$.

Proof Let $L$ be a finite extension of $K$ and $\alpha \in L$. Then $[K(\alpha) : K] \le [L : K] < \infty$. Thus, from Proposition 8.1.7, $\alpha$ is algebraic over $K$. ♯

Proposition 8.1.13 Let $L$ be a field extension of $K$. Let $L_0$ be the set of all algebraic elements of $L$ over $K$. Then $L_0$ is a subfield of $L$ containing $K$ which is the largest algebraic extension of $K$ contained in $L$.


Proof Every element $a \in K$ is algebraic over $K$, for it is a root of $X - a$. Thus $K \subseteq L_0$. Let $\alpha, \beta \in L_0$. Since $\alpha$ is algebraic over $K$, $K(\alpha)$ is a finite extension of $K$. Further, since $\beta$ is algebraic over $K$, it is also algebraic over $K(\alpha)$. Thus $K(\alpha, \beta) = K(\alpha)(\beta)$ is a finite extension of $K(\alpha)$. It follows that $K(\alpha, \beta)$ is also a finite extension of $K$. Hence $K(\alpha, \beta)$ is an algebraic extension of $K$. Hence $\alpha \pm \beta$ and $\alpha \cdot \beta^{-1}$ (for $\beta \neq 0$) are also algebraic over $K$, and so they belong to $L_0$. This shows that $L_0$ is a subfield. Clearly, this is the largest field contained in $L$ which is algebraic over $K$. ♯


Definition 8.1.14 The subfield Lo of L in the above proposition is called the 
algebraic closure of K in L. 


Corollary 8.1.15 Let $L$ be an algebraic extension of $F$, and $F$ be an algebraic extension of $K$. Then $L$ is an algebraic extension of $K$.

Proof Let $\alpha \in L$. Since $L$ is algebraic over $F$, $\alpha$ is algebraic over $F$. Let

$$\min_F(\alpha)(X) = X^n + a_1 X^{n-1} + \cdots + a_n.$$

Then $\alpha$ is algebraic over $F' = K(a_1, a_2, \ldots, a_n)$. Clearly, $F'(\alpha)$ is a finite extension of $F'$, and $[F'(\alpha) : F'] = n$. Further, since $a_1$ is algebraic over $K$, $K(a_1)$ is a finite extension of $K$. Again, since $a_2$ is algebraic over $K$, and so also over $K(a_1)$, it follows that $K(a_1)(a_2) = K(a_1, a_2)$ is a finite extension of $K(a_1)$. In turn, it follows that $K(a_1, a_2)$ is a finite extension of $K$. Proceeding inductively, we find that $F'$ is a finite extension of $K$. But then $F'(\alpha)$ is also a finite extension of $K$. Hence every element of $F'(\alpha)$ is algebraic over $K$. In particular, $\alpha$ is algebraic over $K$. ♯


Definition 8.1.16 An extension $L$ of $K$ is called a simple extension if there is an element $\alpha \in L$ such that $L = K(\alpha)$. Such an element $\alpha$ is called a primitive element of the extension.


Theorem 8.1.17 Let $L$ be a finite extension of $K$. Then $L$ is simple over $K$ if and only if there are only finitely many intermediary fields between $L$ and $K$.

Proof Suppose that $L = K(\alpha)$ is a finite simple extension of $K$. Let $F$ be a subfield of $L$ containing $K$. Then $\alpha$ is algebraic over $K$ and also over $F$. Clearly, $\min_F(\alpha)(X)$ is a divisor of $\min_K(\alpha)(X)$. We show that $F$ is uniquely determined by the factor $\min_F(\alpha)(X)$ of $\min_K(\alpha)(X)$. Suppose that

$$\min_F(\alpha)(X) = a_0 + a_1 X + a_2 X^2 + \cdots + a_{n-1} X^{n-1} + X^n,$$

where $a_i \in F$. Consider the subfield $K' = K(a_0, a_1, \ldots, a_{n-1})$ of $F$. Then $\min_F(\alpha)(X) = \min_{K'}(\alpha)(X)$, and so $[L : F] = [L : K']$. Since $K' \subseteq F$, it follows that $F = K'$. This shows that $F$ is uniquely determined by $\min_F(\alpha)(X)$. Since $\min_K(\alpha)(X)$ has only finitely many monic polynomial divisors, we have only finitely many intermediary fields between $L$ and $K$.

Conversely, suppose that there are only finitely many intermediary fields. Then we have to show that $L$ is a simple extension of $K$. Suppose first that $K$ is a finite field, and $L$ is a finite extension of $K$. Then $L$ is also a finite field. Further, we know that the multiplicative group $L^*$ of nonzero elements of $L$ is cyclic. Suppose that it is generated by $\alpha$. Then it is clear that $L = K(\alpha)$.

Assume, now, that $K$ is infinite. Since $[L : K]$ is finite, $L$ is a finitely generated extension of $K$. Suppose that $L = K(\alpha_1, \alpha_2, \ldots, \alpha_n)$. The proof is by induction on $n$. If $n = 1$, then there is nothing to do. Suppose that the result is true for $n$. Suppose that $L = K(\alpha_1, \alpha_2, \ldots, \alpha_n, \alpha_{n+1})$. Then, by the induction hypothesis, $K(\alpha_1, \alpha_2, \ldots, \alpha_n) = K(\alpha)$ for some $\alpha \in L$. Clearly, $L = K(\alpha, \alpha_{n+1})$. Consider the set $\{K(a\alpha + \alpha_{n+1}) \mid a \in K\}$. Since there are only finitely many intermediary fields between $K$ and $L$, and $K$ is infinite, there are distinct elements $a, b \in K$ such that $K(a\alpha + \alpha_{n+1}) = K(b\alpha + \alpha_{n+1}) = F$ (say). But then $\alpha = ((a\alpha + \alpha_{n+1}) - (b\alpha + \alpha_{n+1}))(a - b)^{-1}$ belongs to $K(a\alpha + \alpha_{n+1})$. This also shows that $\alpha_{n+1} \in K(a\alpha + \alpha_{n+1})$. Hence $L = K(\alpha, \alpha_{n+1}) = K(a\alpha + \alpha_{n+1})$. The proof is complete. ♯


Now, we have some examples. 


Example 8.1.18 The field $\mathbb{C}$ of complex numbers is an extension field of $\mathbb{R}$. Since $\{1, i\}$ is a basis of the vector space $\mathbb{C}$ over $\mathbb{R}$, we have $[\mathbb{C} : \mathbb{R}] = 2$. The field $\mathbb{R}$ of real numbers is an extension of the field $\mathbb{Q}$ of rational numbers. The dimension of $\mathbb{R}$ considered as a vector space over $\mathbb{Q}$ is infinite (since $\mathbb{Q}^n$ is countable, any finite-dimensional vector space over $\mathbb{Q}$ is countable, but $\mathbb{R}$ is uncountable). $\sqrt{2}$ is an element of $\mathbb{R}$ which is algebraic over $\mathbb{Q}$, for it is a root of the polynomial $X^2 - 2$. Also, since $X^2 - 2$ is the smallest degree monic polynomial over $\mathbb{Q}$ of which $\sqrt{2}$ is a root, $\min_{\mathbb{Q}}(\sqrt{2})(X) = X^2 - 2$. Further, $\mathbb{Q}(\sqrt{2}) = \mathbb{Q}[\sqrt{2}] = \{a + b\sqrt{2} \mid a, b \in \mathbb{Q}\}$. Note that $[\mathbb{Q}(\sqrt{2}) : \mathbb{Q}] = 2$.


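Example 8.1.19 Let $\omega$ be a primitive cube root of unity. Then $\omega$ is algebraic over $\mathbb{Q}$, for it is a root of the polynomial $X^3 - 1$. Further, since $\omega \notin \mathbb{Q}$, and it is also a root of $X^2 + X + 1$, it follows that $X^2 + X + 1$ is the minimum polynomial of $\omega$ over $\mathbb{Q}$. Clearly, $\mathbb{Q}(\omega) = \mathbb{Q}[\omega] = \{a + b\omega \mid a, b \in \mathbb{Q}\}$, and $[\mathbb{Q}(\omega) : \mathbb{Q}] = 2$.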


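Example 8.1.20 $\sqrt[3]{2}$ and $\omega$ are algebraic over $\mathbb{Q}$. Hence $\mathbb{Q}[\sqrt[3]{2}, \omega] = \mathbb{Q}(\sqrt[3]{2}, \omega)$ is a finite extension of $\mathbb{Q}$. We show that it is a simple extension of $\mathbb{Q}$ (in fact, we shall show later that every finite extension of a field of characteristic 0 is simple). We find an element $\alpha \in \mathbb{C}$ such that $\mathbb{Q}(\alpha) = \mathbb{Q}(\sqrt[3]{2}, \omega)$. Observe that $\sqrt[3]{2} + \omega$ belongs to $\mathbb{Q}(\sqrt[3]{2}, \omega)$. Put $\alpha = \sqrt[3]{2} + \omega$. We show that $\mathbb{Q}(\alpha) = \mathbb{Q}(\sqrt[3]{2}, \omega)$. We have $(\alpha - \omega)^3 = 2$. Using the binomial expansion and the facts that $\omega^3 = 1$ and $\omega^2 = -1 - \omega$, we find that

$$\omega = \frac{\alpha^3 - 3\alpha - 3}{3\alpha^2 + 3\alpha}.$$

Note that $\alpha^2 + \alpha \neq 0$. Thus, $\omega \in \mathbb{Q}(\alpha)$, and so $\sqrt[3]{2} = \alpha - \omega$ also belongs to $\mathbb{Q}(\alpha)$. This shows that $\mathbb{Q}(\alpha) = \mathbb{Q}(\sqrt[3]{2}, \omega)$. Next, we find the minimum polynomial of $\alpha$, and also the degree $[\mathbb{Q}(\alpha) : \mathbb{Q}]$. Note that $\mathbb{Q}(\alpha)$ is an extension of $\mathbb{Q}(\sqrt[3]{2})$, and it is also an extension of $\mathbb{Q}(\omega)$. Since $[\mathbb{Q}(\sqrt[3]{2}) : \mathbb{Q}] = 3$ ($X^3 - 2$ is the minimum polynomial of $\sqrt[3]{2}$) and $[\mathbb{Q}(\omega) : \mathbb{Q}] = 2$, it follows that 2 and 3 both divide $[\mathbb{Q}(\alpha) : \mathbb{Q}]$. Thus 6 divides $[\mathbb{Q}(\alpha) : \mathbb{Q}]$. Next,

$$\alpha^3 - 3\alpha - 3 = \omega(3\alpha^2 + 3\alpha).$$

Putting the value of $\omega$ from the above equation in the equation $\omega^2 + \omega + 1 = 0$, we get that

$$(\alpha^3 - 3\alpha - 3)^2 + (3\alpha^2 + 3\alpha)^2 + (3\alpha^2 + 3\alpha)(\alpha^3 - 3\alpha - 3) = 0.$$

This gives us a monic polynomial of degree six,

$$(X^3 - 3X - 3)^2 + (3X^2 + 3X)^2 + (3X^2 + 3X)(X^3 - 3X - 3),$$

of which $\alpha$ is a root. Since $[\mathbb{Q}(\alpha) : \mathbb{Q}]$ is at least 6, it follows that this is the minimum polynomial of $\alpha$, and $[\mathbb{Q}(\alpha) : \mathbb{Q}] = 6$.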
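The degree six polynomial of Example 8.1.20 can be expanded and checked numerically. The Python sketch below does the expansion with integer coefficient lists (constant term first, a convention chosen here) and evaluates the result at a floating-point approximation of the sum of the real cube root of 2 and $\omega = e^{2\pi i/3}$; it is a sanity check under these assumptions, not a proof.

```python
import cmath

def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]

def poly_eval(p, x):
    """Horner evaluation; coefficients are stored constant term first."""
    value = 0
    for c in reversed(p):
        value = value * x + c
    return value

P = [-3, -3, 0, 1]   # X^3 - 3X - 3
Q = [0, 3, 3]        # 3X^2 + 3X

# P^2 + Q^2 + Q*P, the polynomial obtained from omega^2 + omega + 1 = 0.
candidate = poly_add(poly_add(poly_mul(P, P), poly_mul(Q, Q)), poly_mul(Q, P))

omega = cmath.exp(2j * cmath.pi / 3)
alpha = 2 ** (1 / 3) + omega
residual = abs(poly_eval(candidate, alpha))
```

The expansion gives $X^6 + 3X^5 + 6X^4 + 3X^3 + 9X + 9$, and `residual` is zero up to floating-point rounding.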


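Example 8.1.21 Let $p$ and $q$ be distinct prime numbers. Then $\sqrt{p}$ and $\sqrt{q}$ are algebraic over $\mathbb{Q}$ with minimum polynomials $X^2 - p$ and $X^2 - q$, respectively. Clearly, $\mathbb{Q}(\sqrt{p}) = \{a + b\sqrt{p} \mid a, b \in \mathbb{Q}\}$, and $[\mathbb{Q}(\sqrt{p}) : \mathbb{Q}] = 2$. Also $\sqrt{q} \notin \mathbb{Q}(\sqrt{p})$, and so $\sqrt{q}$ is algebraic over $\mathbb{Q}(\sqrt{p})$ with minimum polynomial $X^2 - q$. Thus, $[\mathbb{Q}(\sqrt{p}, \sqrt{q}) : \mathbb{Q}] = [\mathbb{Q}(\sqrt{p}, \sqrt{q}) : \mathbb{Q}(\sqrt{p})][\mathbb{Q}(\sqrt{p}) : \mathbb{Q}] = 4$. We show that $\mathbb{Q}(\sqrt{p}, \sqrt{q}) = \mathbb{Q}(\sqrt{p} + \sqrt{q})$. Note that $\mathbb{Q}(\sqrt{p} + \sqrt{q}) \subseteq \mathbb{Q}(\sqrt{p}, \sqrt{q})$. Put $\alpha = \sqrt{p} + \sqrt{q}$. Then

$$(\alpha - \sqrt{p})^2 = \alpha^2 + p - 2\alpha\sqrt{p} = q.$$

This shows that $2\alpha\sqrt{p} = \alpha^2 + p - q$ belongs to $\mathbb{Q}(\alpha)$. Hence $\sqrt{p} \in \mathbb{Q}(\alpha)$. Similarly, $\sqrt{q} \in \mathbb{Q}(\alpha)$. This shows that $\mathbb{Q}(\alpha) = \mathbb{Q}(\sqrt{p}, \sqrt{q})$. Next,

$$(\alpha^2 + p - q)^2 - 4p\alpha^2 = 0.$$

This shows that $\alpha$ satisfies a monic polynomial of degree 4, which, since $[\mathbb{Q}(\alpha) : \mathbb{Q}] = 4$, is the minimum polynomial of $\alpha$.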
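Expanding $(X^2 + p - q)^2 - 4pX^2$ gives $X^4 - 2(p + q)X^2 + (p - q)^2$. The brief Python sketch below (function names are illustrative, coefficients stored constant term first) builds this quartic and checks numerically that $\sqrt{p} + \sqrt{q}$ is a root for one pair of primes.

```python
from math import sqrt

def quartic_for_sum_of_roots(p, q):
    """Coefficients of X^4 - 2(p + q)X^2 + (p - q)^2, constant term first."""
    return [(p - q) ** 2, 0, -2 * (p + q), 0, 1]

def poly_eval(coeffs, x):
    value = 0
    for c in reversed(coeffs):
        value = value * x + c
    return value

quartic = quartic_for_sum_of_roots(2, 3)
residual = abs(poly_eval(quartic, sqrt(2) + sqrt(3)))
```

For $p = 2$ and $q = 3$ the quartic is $X^4 - 10X^2 + 1$, and `residual` vanishes up to rounding error.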


Example 8.1.22 Let $p$ be a positive prime integer. By the Eisenstein irreducibility criterion, $X^n - p$ is irreducible in $\mathbb{Q}[X]$. Hence $\sqrt[n]{p}$ is an algebraic element over $\mathbb{Q}$ of which $X^n - p$ is the minimum polynomial. It follows that $[\mathbb{Q}(\sqrt[n]{p}) : \mathbb{Q}] = n$.


Example 8.1.23 $\pi$ and the exponential $e$ are transcendental (not algebraic). It was
Hermite who proved the transcendence of $e$ in 1873. Earlier, the irrationality of $e$
was established by Liouville. The transcendence of $\pi$ was established by Lindemann.
The proof of this fact can be found in Algebra by S. Lang. You may also refer to
the corollary following Theorem 8.7.5. It is not known if $\pi$ is transcendental over
$\mathbb{Q}(e)$, or equivalently, it is not known if $e$ is transcendental over $\mathbb{Q}(\pi)$.


Proposition 8.1.24 Let $K(X)$ be the field of fractions of the polynomial ring $K[X]$
(this is also called the function field over $K$ in one variable). Let $T \in K(X) - K$.
Then $K(X)$ is an algebraic extension of $K(T)$, and $T$ is transcendental over $K$.
Further, if $T = \frac{f(X)}{g(X)}$, where $f(X)$ and $g(X)$ are co-prime in $K[X]$, then $h(Y) =
Tg(Y) - f(Y) \in K(T)[Y]$ is a minimum degree polynomial over $K(T)$ of which $X$ is
a root. In turn, $[K(X) : K(T)] = \max(\deg f(X), \deg g(X))$.


Proof Put $L = K(T)$, where $T = \frac{f(X)}{g(X)}$. We find the minimum degree polynomial
in $K(T)[Y]$ of which $X$ is a root. Consider the polynomial $h(Y) = Tg(Y) - f(Y)$ in
$K(T)[Y]$. We first observe that $h(Y) \neq 0$ and $\deg h(Y) = \max(\deg f(X), \deg g(X))$.
If not, then the leading terms of $Tg(Y)$ and $f(Y)$ should be the same. Now, the leading
coefficient of $Tg(Y)$ is $Ta$ for some $a \in K$, and the leading coefficient of $f(Y)$ is
$b$ for some $b \in K$. This would mean that $T = ba^{-1} \in K$, a contradiction to the
supposition that $T \notin K$. Clearly, $X$ is a root of $h(Y)$, and so $X$ is algebraic over $K(T)$.
This also shows that $K(X)$ is an algebraic extension of $K(T)$. It is sufficient, therefore,
to show that $h(Y)$ is irreducible in $K(T)[Y]$. We first observe that $T$ is transcendental
over $K$ (in other words, every element of $K(X) - K$ is transcendental over $K$). For if
not, then $T$ would be algebraic over $K$. This says that $K(T)$ is an algebraic extension of
$K$. Since $K(X)$ is already an algebraic extension of $K(T)$, this would mean that $K(X)$ is
an algebraic extension of $K$, a contradiction to the fact that $X$ is transcendental over $K$.
Thus, $K[T]$ is isomorphic to the polynomial ring over $K$. Since $f(X)$ and $g(X)$ are
co-prime in $K[X]$, $h(Y) = Tg(Y) - f(Y)$ is a primitive polynomial of degree 1 in
$K[Y][T]$. Hence it is irreducible in $K[Y][T] = K[T][Y]$, and so, by the Gauss lemma, it is also irreducible
in $K(T)[Y]$. The result follows. ♯


Remark 8.1.25 Lüroth proved in 1876 a fundamental result in the theory of algebraic
function fields: any subfield $F$ of $L = K(X)$ properly containing $K$ is again of the form
$K(T)$, where, of course, $T$ is transcendental over $K$. When $K$ is an algebraically closed
field, in the sense that there is no proper algebraic extension of $K$, a result of
Castelnuovo says that every subfield of $K(X, Y)$ containing $K$ is of the form $K(T)$,
or it is of the form $K(T, S)$. A result of this kind is not true for algebraic function
fields in 3 variables.


Example 8.1.26 Let $L = K(X)$ be the field of rational functions over $K$ in one
variable. We determine the Galois group of $K(X)$ over $K$ (the group of automorphisms
of $K(X)$ fixing the members of $K$). Let $T \in K(X)$ be such that $K(X) = K(T)$. Then
$T \notin K$. Suppose that $T = \frac{f(X)}{g(X)}$, where $f(X)$ and $g(X)$ are co-prime in $K[X]$. Then
$[K(X) : K(T)] = 1$, and so from the previous proposition, $\max(\deg f(X), \deg g(X)) = 1$.
Thus, $T = \frac{aX + b}{cX + d}$ for some $a, b, c, d \in K$. Interchanging the roles of $X$ and $T$, we
observe that $X = \frac{uT + v}{pT + q}$ for some $u, v, p, q \in K$. Thus

\[ X = \frac{u\,\frac{aX + b}{cX + d} + v}{p\,\frac{aX + b}{cX + d} + q}, \]

or equivalently,

\[ p(aX^2 + bX) + q(cX^2 + dX) = uaX + ub + vcX + vd. \]

Thus, comparing the coefficients of the same powers of $X$, we obtain that $pa + qc =
0 = ub + vd$, and $pb + qd = ua + vc \neq 0$. These relations say that the product
$\begin{pmatrix} u & v \\ p & q \end{pmatrix}\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is a nonzero scalar matrix. This shows that the matrix

\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \]

is a nonsingular $2 \times 2$ matrix, and so it belongs to the group $GL(2, K)$. Conversely,
given such a matrix in $GL(2, K)$, we can solve for $X$ in terms of $T = \frac{aX + b}{cX + d}$, and so
$K(X) = K(T)$. It follows that any element

\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \]

of $GL(2, K)$ determines an element of $G(K(X)/K)$ which takes $X$ to $\frac{aX + b}{cX + d}$. This
defines a surjective homomorphism $\eta$ (say) from $GL(2, K)$ to $G(K(X)/K)$. Since
$T = X$ if and only if $a = d$ and $b = 0 = c$, the kernel of $\eta$ is the normal
subgroup of $GL(2, K)$ consisting of the scalar matrices. The subgroup of the scalar
matrices is precisely the center of $GL(2, K)$. The quotient group of $GL(2, K)$ modulo
its center is called the projective general linear group, and it is denoted by
$PGL(2, K)$. Thus, the group $G(K(X)/K)$ of $K$-automorphisms of $K(X)$ is isomorphic
to $PGL(2, K)$.
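The correspondence between matrices and fractional linear substitutions can be illustrated with exact rational arithmetic. The sketch below is an assumption-laden illustration rather than the construction of the text: it takes $K = \mathbb{Q}$, lets a matrix act on a few sample rational values (instead of on the indeterminate $X$), and checks that composing two substitutions agrees with the matrix product and that a scalar matrix acts trivially. The names `mobius` and `matmul` are hypothetical helpers.

```python
from fractions import Fraction

def mobius(m, x):
    """Apply x -> (a*x + b)/(c*x + d) for m = (a, b, c, d)."""
    a, b, c, d = m
    return (a * x + b) / (c * x + d)

def matmul(m, n):
    """Product of 2x2 matrices stored row-wise as (a, b, c, d)."""
    a, b, c, d = m
    e, f, g, h = n
    return (a * e + b * g, a * f + b * h, c * e + d * g, c * f + d * h)

M = (1, 2, 3, 4)       # determinant -2, so M lies in GL(2, Q)
N = (2, 0, 1, 1)       # determinant 2
SCALAR = (5, 0, 0, 5)  # a scalar matrix

# Sample points chosen so that no denominator vanishes
samples = [Fraction(5), Fraction(7), Fraction(1, 3)]

composition_matches = all(
    mobius(M, mobius(N, x)) == mobius(matmul(M, N), x) for x in samples
)
scalar_is_trivial = all(mobius(SCALAR, x) == x for x in samples)

print(composition_matches, scalar_is_trivial)
```

Acting on values rather than on $X$ itself may reverse the order of composition compared with the automorphism picture; the check here only concerns the action on values.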


Exercise 


8.1.1 Let $K$ be a field, and $K((X)) = \{\sum_{n=m}^{\infty} a_n X^n \mid a_n \in K, m \in \mathbb{Z}\}$ be the set of
formal power series with coefficients in $K$. Define $+$ by
\[ \sum_{n} a_n X^n + \sum_{n} b_n X^n = \sum_{n} (a_n + b_n) X^n, \]
and the multiplication $\cdot$ by
\[ \Big(\sum_{n=m}^{\infty} a_n X^n\Big) \cdot \Big(\sum_{n=r}^{\infty} b_n X^n\Big) = \sum_{n=m+r}^{\infty} \Big(\sum_{l=m}^{n-r} a_l b_{n-l}\Big) X^n. \]

Show that $K((X))$ is a field with respect to the above operations, and it is a field
extension of $K(X)$.


8.1.2 Give an example of an algebraic extension which is not a finite extension. 


8.1.3 Show that $\mathbb{Q}(\sqrt{5}, \sqrt{7})$ is a finite extension of $\mathbb{Q}$. Find its degree. Find also
a primitive element of the extension together with the minimum polynomial of that
primitive element.


8.1.4 Find the minimum polynomial of a primitive 11th root of unity over Q. 


8.1.5 Let $L$ be a field extension of $K$. Let $\alpha_1, \alpha_2, \ldots, \alpha_r$ be elements of $L$ which
are algebraic over $K$. Show that
\[ [K(\alpha_1, \alpha_2, \ldots, \alpha_r) : K] \leq \prod_{i=1}^{r} [K(\alpha_i) : K]. \]


Show, by means of an example, that the strict inequality may hold. Find a sufficient 
condition for the equality. 


8.1.6 Show that the fields $\mathbb{Q}(\sqrt{p})$ and $\mathbb{Q}(\sqrt{q})$, where $p$ and $q$ are distinct primes,
are not isomorphic whereas they are isomorphic as vector spaces over $\mathbb{Q}$.


8.1.7 Let L be an algebraic extension of K. Show that any subring of L which 
contains K is a field. 


8.1.8 Let $L$ be a finite field extension of $K$. Let $F_1$ and $F_2$ be intermediary fields.
The smallest subfield of $L$ containing $F_1$ and $F_2$ is called the composite of $F_1$ and
$F_2$, and it is denoted by $F_1F_2$. Show that

\[ [F_1F_2 : K] \leq [F_1 : K][F_2 : K]. \]

Further, show that the equality holds provided that $[F_1 : K]$ and $[F_2 : K]$ are co-prime.
Give an example to show that the strict inequality may hold.


8.1.9 Find the degree of Q(V/2, V7) over Q, and also find a primitive element for 
the extension. 


8.1.10 Let $\overline{\mathbb{Q}}$ be the algebraic closure of $\mathbb{Q}$ in $\mathbb{C}$. Show that $[\overline{\mathbb{Q}} : \mathbb{Q}]$ is infinite.
8.1.11 Let $m$ be co-prime to $[K(a) : K]$. Show that $K(a) = K(a^m)$.


8.1.12 Show that the composite of two field extensions $F_1$ and $F_2$ is algebraic if and
only if both are algebraic.


8.1.13 Show, by means of an example, that $[L : K] = 3$ need not mean that
$L = K(\sqrt[3]{a})$ for some $a$.


8.1.14 Show that $\sin m^{\circ}$ is algebraic for all rational $m$.


8.2 Galois Extensions 


To each polynomial f(X) € K[X], Galois attached a group, called the Galois group 
of the polynomial f(X). It is essentially a group of permutations of roots of f(X) in a
certain extension field L of K over which f(X) splits into linear factors. He showed that
the polynomial equation f(X) = 0 can be solved using field and radical operations 
if and only if the Galois group of f (X) is a solvable group. Here, we follow a slightly 
different but equivalent approach due to Artin in which we proceed with the Galois 
group of an extension. 


Let $L$ and $L'$ be two extensions of $K$. A ring homomorphism $f$ from $L$ to $L'$ such that
$f(a) = a$ for all $a \in K$ is called a $K$-homomorphism. Clearly, a $K$-homomorphism
$f$ from $L$ to $L'$ is also a vector space homomorphism from $L$ to $L'$ considered as vector
spaces over $K$, for $f(a\alpha) = f(a)f(\alpha) = af(\alpha)$ for all $a \in K$ and $\alpha \in L$. Since $L$ is
a field, $\ker f = \{0\}$ or $\ker f = L$. Since $f$ takes identity to identity, $\ker f = \{0\}$. This
shows that $f$ is injective. If $f$ is also a bijection, then $f$ is called a $K$-isomorphism.
Thus, if $[L : K] = [L' : K] < \infty$, then $f$ is an isomorphism (an injective linear
transformation between vector spaces of the same finite dimension is an isomorphism). In
particular, if $L$ is a finite field extension of $K$, then any $K$-homomorphism from $L$ to
$L$ is an isomorphism. This is called a $K$-automorphism of $L$.


Definition 8.2.1 Let $L$ be a field extension of $K$. Then the set of all $K$-automorphisms
of $L$ forms a group under composition of maps. This group is called the
Galois group of the extension $L$ of $K$, and it is denoted by $G(L/K)$.


Definition 8.2.2 Let $L$ be a field, and $X$ be a subset of the group $Aut(L)$. Let $F(X)$
denote the set of all elements $a \in L$ such that $\sigma(a) = a$ for all $\sigma \in X$. Then, $F(X)$ is
a subfield of $L$, and it is called the fixed field of $X$. It is clear that $F(X) = F(\langle X \rangle)$.


Observe that if X C G(L/K), then F(X) is an intermediary field of the field 
extension L of K. 

Let $S(G(L/K))$ denote the set of all subgroups of the Galois group $G(L/K)$ of the
field extension $L$ of $K$, and $SF(L/K)$ denote the set of all intermediary fields. Then,
we have a map $\Phi$ from $S(G(L/K))$ to $SF(L/K)$ defined by $\Phi(H) = F(H)$ (the fixed
field of $H$), and a map $\Psi$ from $SF(L/K)$ to $S(G(L/K))$ defined by $\Psi(F) = G(L/F)$.

The aim of this and the following two sections will be to show that in case $L$
is a finite extension of $K$, these two maps are inverses to each other if and only if
$K = F(G(L/K))$.


Definition 8.2.3 Let L be an algebraic extension of K. We say that L is a 
Galois extension of K if K = F(G(L/K)). 


Proposition 8.2.4 Let L be a field extension of K. Then, 


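1. $K_1 \subseteq K_2$, $K_1, K_2 \in SF(L/K) \implies G(L/K_2) \subseteq G(L/K_1)$.
2. $H_1 \subseteq H_2$, $H_1, H_2 \in S(G(L/K)) \implies F(H_2) \subseteq F(H_1)$.
3. $S \subseteq G(L/K) \implies F(S) = F(\langle S \rangle)$.
4. $K' \in SF(L/K) \implies K' \subseteq F(G(L/K'))$.
5. $S \subseteq G(L/K) \implies \langle S \rangle \subseteq G(L/F(\langle S \rangle))$.
6. $F(H) = F(G(L/F(H)))$ for all $H \in S(G(L/K))$.
7. $G(L/K') = G(L/F(G(L/K')))$ for all $K' \in SF(L/K)$.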


Proof 1 and 2 follow from the definitions. Since $S \subseteq \langle S \rangle$, if $a$ is fixed by every
element of $\langle S \rangle$, then it is also fixed by every element of $S$. Hence $F(\langle S \rangle) \subseteq
F(S)$. Further, suppose that $a$ is fixed by every element of $S$. Then it is fixed by every
power of elements of $S$, and so also by the products of powers of elements of $S$. Thus,
$F(S) \subseteq F(\langle S \rangle)$. This proves 3. 4 and 5 also follow from the definitions.

Now, we prove 6. Let $a \in F(H)$. If $\sigma \in G(L/F(H))$, then by the definition
$\sigma(a) = a$. This shows that $a \in F(G(L/F(H)))$. Thus, $F(H) \subseteq F(G(L/F(H)))$.
Next, by 5, $H \subseteq G(L/F(H))$. By 2, $F(G(L/F(H))) \subseteq F(H)$. This completes the
proof of 6.

Finally, we prove 7. From 4, it follows that $K' \subseteq F(G(L/K'))$, and so by 1,
$G(L/F(G(L/K'))) \subseteq G(L/K')$. Let $\sigma \in G(L/K')$. Then by the definition $\sigma$ will
fix every member of $F(G(L/K'))$, and hence it belongs to $G(L/F(G(L/K')))$. This
completes the proof of 7. ♯


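Example 8.2.5 $G(\mathbb{C}/\mathbb{R}) = \{I_{\mathbb{C}}, \sigma\}$, where $\sigma$ denotes the complex conjugation of
$\mathbb{C}$: Let $\sigma$ be a nonidentity $\mathbb{R}$-automorphism of $\mathbb{C}$. Let $a + ib \in \mathbb{C}$. Then $\sigma(a + ib) =
\sigma(a) + \sigma(i)\sigma(b) = a + \sigma(i)b$. Since $\sigma$ is an automorphism, $-1 = \sigma(-1) =
\sigma(i^2) = \sigma(i)\sigma(i) = (\sigma(i))^2$. This shows that $\sigma(i) = \pm i$. Since $\sigma$ is nonidentity,
$\sigma(i) = -i$. Thus, $\sigma(a + ib) = a - ib = \overline{a + ib}$. It follows that $\mathbb{R} = F(G(\mathbb{C}/\mathbb{R}))$,
and so the extension $\mathbb{C}$ of $\mathbb{R}$ is a Galois extension.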


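Example 8.2.6 As in the above example it can be seen that $G(\mathbb{Q}(\sqrt{2})/\mathbb{Q}) =
\{I_{\mathbb{Q}(\sqrt{2})}, \sigma\}$, where $\sigma(a + b\sqrt{2}) = a - b\sqrt{2}$. It follows that the extension $\mathbb{Q}(\sqrt{2})$ is
also a Galois extension of $\mathbb{Q}$.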
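The claim that $a + b\sqrt{2} \mapsto a - b\sqrt{2}$ respects the field operations can be spot-checked with exact arithmetic. The following sketch is illustrative only: it stores $a + b\sqrt{2}$ as the pair `(a, b)`, and the names `mul`, `add`, and `sigma` are assumptions introduced here. A check on sample elements is of course not a proof.

```python
from fractions import Fraction

def mul(u, v):
    """Product in Q(sqrt 2); the pair (a, b) stands for a + b*sqrt(2)."""
    a, b = u
    c, d = v
    return (a * c + 2 * b * d, a * d + b * c)

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def sigma(u):
    """The nonidentity automorphism a + b*sqrt(2) -> a - b*sqrt(2)."""
    return (u[0], -u[1])

x = (Fraction(1, 2), Fraction(3))
y = (Fraction(-2), Fraction(5, 7))
rational = (Fraction(4), Fraction(0))

respects_mul = sigma(mul(x, y)) == mul(sigma(x), sigma(y))
respects_add = sigma(add(x, y)) == add(sigma(x), sigma(y))
fixes_rational = sigma(rational) == rational
moves_irrational = sigma(x) != x

print(respects_mul, respects_add, fixes_rational, moves_irrational)
```

An element is fixed by `sigma` exactly when its second coordinate is zero, which matches the statement that the fixed field is $\mathbb{Q}$.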


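Example 8.2.7 Consider the extension $\mathbb{Q}(\sqrt[3]{5})$ of $\mathbb{Q}$, where $\sqrt[3]{5}$ is the real cube root
of 5. It is not a Galois extension: We first show that $G(\mathbb{Q}(\sqrt[3]{5})/\mathbb{Q}) = \{I_{\mathbb{Q}(\sqrt[3]{5})}\}$.
Clearly, $\mathbb{Q}(\sqrt[3]{5}) = \{a + b5^{1/3} + c5^{2/3} \mid a, b, c \in \mathbb{Q}\}$. The other cube roots of 5 are not
in $\mathbb{Q}(\sqrt[3]{5})$, for they are not real numbers. If $\sigma$ is an automorphism of $\mathbb{Q}(\sqrt[3]{5})$, then
$5 = \sigma(5) = (\sigma(5^{1/3}))^3$. Thus, $\sigma(5^{1/3}) = 5^{1/3}$. Hence $\sigma$ is the identity map. Thus,
the Galois group of the extension is trivial, and $F(G(\mathbb{Q}(\sqrt[3]{5})/\mathbb{Q})) = \mathbb{Q}(\sqrt[3]{5}) \neq \mathbb{Q}$. Hence it is not a
Galois extension.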


Since any field automorphism of the field R of real numbers fixes Q, and since it 
also preserves order, it is the trivial identity automorphism. Thus, the Galois group 
of R over Q is trivial. In turn, it follows that R is not a Galois extension of Q. 


Proposition 8.2.8 Let $L_1$ and $L_2$ be field extensions of $K$. Let $\sigma$ be a $K$-homomorphism
from $L_1$ to $L_2$. Let $\alpha$ be an element of $L_1$ which is algebraic over
$K$. Then $\sigma(\alpha)$ is also algebraic over $K$, and $min_K(\alpha)(X) = min_K(\sigma(\alpha))(X)$. Further,
$\sigma$ permutes the roots of $min_K(\alpha)(X)$ if $L_1 = L_2$.


Proof Since $\alpha$ is algebraic over $K$, there exists a nonzero polynomial $f(X) \in K[X]$
such that $f(\alpha) = 0$. Suppose that

\[ f(X) = a_0 + a_1X + a_2X^2 + \cdots + a_nX^n, \]
where each $a_i \in K$. Then, since $\sigma$ is a $K$-homomorphism,

\[ 0 = \sigma(0) = \sigma(f(\alpha)) = \sum_{i=0}^{n} \sigma(a_i)\sigma(\alpha)^i = \sum_{i=0}^{n} a_i\sigma(\alpha)^i = f(\sigma(\alpha)). \]

Thus, $\sigma(\alpha)$ is also algebraic. Further, $min_K(\alpha)(\sigma(\alpha)) = 0$, and so $min_K(\sigma(\alpha))(X)$
divides $min_K(\alpha)(X)$, and since the latter is irreducible and both are monic, $min_K(\alpha)(X) = min_K(\sigma(\alpha))(X)$. ♯


Definition 8.2.9 Let $L$ be a field extension of $K$. We say that an element $\alpha \in L$ is
$K$-conjugate to an element $\beta \in L$ if there exists an element $\sigma \in G(L/K)$ such that
$\sigma(\alpha) = \beta$.


Corollary 8.2.10 If $\alpha$ is an algebraic element of $L$ over $K$, then there are at most
$\deg(min_K(\alpha)(X))$ conjugates of $\alpha$ in $L$ over $K$.


Proof It is clear from the above result that if $\beta$ is conjugate to $\alpha$, then $\beta$ is a root of
$min_K(\alpha)(X)$, and $min_K(\alpha)(X) = min_K(\beta)(X)$. ♯


Example 8.2.11 The number of conjugates to an algebraic element may be strictly
less than $\deg(min_K(\alpha)(X))$: Consider the field $\mathbb{Z}_p(X)$ of rational functions in one
variable over the prime field $\mathbb{Z}_p$. Let

\[ L = \mathbb{Z}_p(X)(X^{1/p}) = \mathbb{Z}_p(X)[X^{1/p}] = \{a_0(X) + a_1(X)X^{1/p} + \cdots + a_r(X)X^{r/p} \mid a_i(X) \in \mathbb{Z}_p(X),\ 0 \leq r < p\}. \]

Then it can be easily seen that $L$ is a field with respect to the usual addition and
multiplication, and it is a field extension of $\mathbb{Z}_p(X)$. The element $X^{1/p}$
is algebraic over $\mathbb{Z}_p(X)$, for it is a root of $Y^p - X$ in $\mathbb{Z}_p(X)[Y]$. By the Eisenstein
irreducibility criterion $Y^p - X$ is irreducible in $\mathbb{Z}_p(X)[Y]$, and so it is the minimum
polynomial of $X^{1/p}$. Since $Y^p - X = (Y - X^{1/p})^p$, it follows that $X^{1/p}$ is self conjugate,
and no other element is conjugate to it.
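The factorization $Y^p - X = (Y - X^{1/p})^p$ rests on the fact that in characteristic $p$ every binomial coefficient $\binom{p}{k}$ with $0 < k < p$ vanishes. A minimal Python sketch of that arithmetic fact follows; the function name `middle_binomials_vanish` is an assumption introduced for illustration.

```python
from math import comb

def middle_binomials_vanish(n):
    """True when every C(n, k) with 0 < k < n is divisible by n."""
    return all(comb(n, k) % n == 0 for k in range(1, n))

# For primes the middle binomial coefficients vanish modulo p, so
# (Y - a)^p = Y^p - a^p holds in any commutative ring of characteristic p.
prime_results = {p: middle_binomials_vanish(p) for p in (2, 3, 5, 7, 11, 13)}

# For a composite modulus such as 4 the property fails: C(4, 2) = 6.
composite_result = middle_binomials_vanish(4)

print(prime_results, composite_result)
```

For $p = 2$ the sign is harmless, since $-a^2 = a^2$ in characteristic 2.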


Proposition 8.2.12 Let $L$ be a field extension of $K$. Let $\alpha$ and $\beta$ be elements of $L$
which are algebraic over $K$. Suppose that they are $K$-conjugate. Then there is a
$K$-isomorphism from $K(\alpha)$ to $K(\beta)$ which takes $\alpha$ to $\beta$.


Proof It follows from the results above that if $\alpha$ and $\beta$ are conjugates, then
$min_K(\alpha)(X) = min_K(\beta)(X)$. The map $f(X) \mapsto f(\alpha)$ defines a surjective homomorphism
from $K[X]$ to $K[\alpha] = K(\alpha)$ whose kernel is the ideal $\langle min_K(\alpha)(X) \rangle$. Thus,
by the fundamental theorem of homomorphism, we have an isomorphism $\sigma$ from
$K[X]/\langle min_K(\alpha)(X) \rangle$ to $K(\alpha)$ such that $\sigma(f(X) + \langle min_K(\alpha)(X) \rangle) = f(\alpha)$.
Clearly, $\sigma(a + \langle min_K(\alpha)(X) \rangle) = a$ for all $a \in K$ and $\sigma(X + \langle min_K(\alpha)(X) \rangle) = \alpha$.
Similarly, we have an isomorphism $\tau$ from $K[X]/\langle min_K(\beta)(X) \rangle = K[X]/\langle min_K(\alpha)(X) \rangle$
to $K(\beta)$ such that $\tau(f(X) + \langle min_K(\alpha)(X) \rangle) = f(\beta)$. Clearly, $\tau \circ \sigma^{-1}$
is a $K$-isomorphism from $K(\alpha)$ to $K(\beta)$ which takes $\alpha$ to $\beta$. ♯


Corollary 8.2.13 Let $\alpha$ be an algebraic element of $L$ over $K$. Then there is a bijective
map $\eta$ from $G(K(\alpha)/K)$ to the set of roots of $min_K(\alpha)(X)$ lying in $K(\alpha)$, defined by $\eta(\sigma) = \sigma(\alpha)$.
In particular, $|G(K(\alpha)/K)|$ is the number of distinct roots of $min_K(\alpha)(X)$ which
are in $K(\alpha)$.


Proof If $\sigma$ is a $K$-automorphism of $K(\alpha)$, then it is uniquely determined by $\sigma(\alpha)$,
which has the same minimum polynomial as $\alpha$. Conversely, if $\beta$ is a root of the
minimum polynomial of $\alpha$ which belongs to $K(\alpha)$, then it follows from the above
proposition that there is a $K$-isomorphism from $K(\alpha)$ to $K(\beta)$ which takes $\alpha$ to $\beta$. But, since $\beta \in K(\alpha)$,
and $[K(\alpha) : K] = \deg(min_K(\alpha)(X)) = \deg(min_K(\beta)(X)) = [K(\beta) : K]$, it
follows that $K(\beta) = K(\alpha)$. The result follows. ♯


Example 8.2.14 In this example, we calculate the Galois group of $\mathbb{Q}(2^{1/3}, \omega) =
\mathbb{Q}(2^{1/3} + \omega)$ over $\mathbb{Q}$: Looking at the minimum polynomial of $2^{1/3} + \omega$ over $\mathbb{Q}$, which we
have already found in Example 8.1.20, we see that the other 5 roots of the minimum
polynomial of $2^{1/3} + \omega$ are $2^{1/3} + \omega^2$, $2^{1/3}\omega + \omega$, $2^{1/3}\omega + \omega^2$, $\omega^2 2^{1/3} + \omega^2$, and $\omega^2 2^{1/3} + \omega$.
The Galois group of this extension is, therefore, of order 6. Denote these elements by
$\alpha_1, \alpha_2, \alpha_3, \alpha_4, \alpha_5, \alpha_6$. Let $\sigma_i$ be the automorphism which takes $\alpha_1$ to $\alpha_i$. Then, $\sigma_1$ is
the identity automorphism. It can be checked that $\sigma_2^2, \sigma_3^3, \sigma_4^2, \sigma_5^2, \sigma_6^3$ are all identity
maps. Thus, the Galois group of this extension is the symmetric group $S_3$ of degree
3.
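The orders of the six automorphisms of $\mathbb{Q}(2^{1/3}, \omega)$ can be tabulated mechanically. The sketch below rests on an encoding that is an assumption of this illustration, not notation from the text: the pair `(j, k)` stands for the automorphism sending $2^{1/3} \mapsto \omega^j 2^{1/3}$ and $\omega \mapsto \omega^k$, with $j \in \{0, 1, 2\}$ and $k \in \{1, 2\}$.

```python
from itertools import product

# (j, k) encodes: cbrt(2) -> omega^j * cbrt(2),  omega -> omega^k
ELEMENTS = [(j, k) for j, k in product(range(3), (1, 2))]
IDENTITY = (0, 1)

def compose(s, t):
    """Return s∘t (apply t first, then s).

    s(t(cbrt2)) = s(omega^j2 * cbrt2) = omega^(k1*j2 + j1) * cbrt2
    s(t(omega)) = s(omega^k2)        = omega^(k1*k2)
    """
    j1, k1 = s
    j2, k2 = t
    return ((j1 + k1 * j2) % 3, (k1 * k2) % 3)

def order(s):
    """Smallest n >= 1 with s^n equal to the identity."""
    n, t = 1, s
    while t != IDENTITY:
        t = compose(t, s)
        n += 1
    return n

orders = sorted(order(s) for s in ELEMENTS)
is_abelian = all(compose(s, t) == compose(t, s)
                 for s in ELEMENTS for t in ELEMENTS)

print(orders, is_abelian)
```

A group of order 6 that is not abelian is isomorphic to $S_3$, which is consistent with the conclusion of the example.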


Proposition 8.2.15 Let L be a finite field extension of K. Then the Galois group 
G(L/K) is finite. 


Proof Let $\{\alpha_1, \alpha_2, \ldots, \alpha_r\}$ be a basis of $L$ over $K$. Clearly, $L = K(\alpha_1, \alpha_2, \ldots, \alpha_r)$,
and any $K$-automorphism of $L$ is uniquely determined by its effect on $\alpha_1, \alpha_2, \ldots, \alpha_r$.
Since any $K$-automorphism of $L$ will take an element of $L$ to one of its conjugates,
and since there are only finitely many conjugates to an algebraic element $\alpha$ (at most
$\deg(min_K(\alpha)(X))$ conjugates of $\alpha$ are there), it follows that a $K$-automorphism of $L$
can take only finitely many values on each $\alpha_i$. Hence $G(L/K)$ is finite. ♯


Definition 8.2.16 Let K be a field and G a group. A group homomorphism from G 
to K* (the multiplicative group of nonzero members of K) is called a character of 
G in K. Thus, a character of G in K is just a one-dimensional representation of G
over K.


Consider the set F(G, K) of all maps from G to K. This is a vector space over K 
with respect to the point wise addition and multiplication by scalars. The dimension 
of this space is the cardinality | G | of G (verify). Let Ch(G, K) denote the set of all 
characters of G in K. Then Ch(G, K) C F(G, K). We have the following result due 
to Dedekind. 


Theorem 8.2.17 (Dedekind) $Ch(G, K)$ is a linearly independent subset of the vector
space $F(G, K)$.


Proof Suppose the contrary. Then there is a finite subset of $Ch(G, K)$ which is
linearly dependent. Let $S = \{\sigma_1, \sigma_2, \ldots, \sigma_n \mid \sigma_i \neq \sigma_j \text{ for } i \neq j\}$ be a minimal finite
linearly dependent subset of $Ch(G, K)$. Then there exist $a_1, a_2, \ldots, a_n$ in $K$ not all
zero such that

\[ a_1\sigma_1 + a_2\sigma_2 + \cdots + a_n\sigma_n = 0, \tag{8.2.1} \]

where 0 in the RHS is the zero of $F(G, K)$. This means that

\[ a_1\sigma_1(g) + a_2\sigma_2(g) + \cdots + a_n\sigma_n(g) = 0 \tag{8.2.2} \]

for all $g \in G$. Indeed, all $a_i$ are nonzero, for otherwise we shall get a proper subset
of $S$ which is linearly dependent, a contradiction to the minimality of $S$. Since $\sigma_1 \neq
\sigma_2$, there exists $h \in G$ such that $\sigma_1(h) \neq \sigma_2(h)$. Multiplying the Eq. (8.2.2) by $\sigma_1(h)$
we get that

\[ \sum_{i=1}^{n} \sigma_1(h)a_i\sigma_i(g) = 0 \tag{8.2.3} \]

for all $g \in G$. Further, substituting $hg$ at the place of $g$ in the Eq. (8.2.2), we get

\[ \sum_{i=1}^{n} a_i\sigma_i(hg) = 0 \]

for all $g \in G$. Since each $\sigma_i$ is a homomorphism, we have

\[ \sum_{i=1}^{n} a_i\sigma_i(h)\sigma_i(g) = 0 \tag{8.2.4} \]

for all $g \in G$. Subtracting the Eq. (8.2.4) from the Eq. (8.2.3), we get that

\[ \sum_{i=1}^{n} (\sigma_1(h) - \sigma_i(h))a_i\sigma_i(g) = 0 \]

for all $g \in G$. Put $b_i = (\sigma_1(h) - \sigma_i(h))a_i$, $i \geq 2$. Then, since $\sigma_1(h) \neq \sigma_2(h)$ and
each $a_i \neq 0$, it follows that $b_2 \neq 0$. Also

\[ b_2\sigma_2(g) + b_3\sigma_3(g) + \cdots + b_n\sigma_n(g) = 0 \]

for all $g \in G$, where $b_2 \neq 0$. This means that

\[ b_2\sigma_2 + b_3\sigma_3 + \cdots + b_n\sigma_n = 0, \]

where 0 in the RHS is the zero of $F(G, K)$. This is a contradiction to the minimality
of $S$. ♯
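A small finite instance makes the Dedekind theorem concrete. The sketch below is an illustration under assumptions chosen here, not an example from the text: it takes $G = \mathbb{Z}/5$ and $K = \mathbb{Z}/11$, uses the element 3 of multiplicative order 5 in $K$ to write down the five characters $\chi_k(g) = 3^{kg}$, and checks by Gaussian elimination modulo 11 that the resulting $5 \times 5$ table of values has full rank. The helper `rank_mod_p` is hypothetical.

```python
P = 11     # the field K = Z/11
N = 5      # the cyclic group G = Z/5
ZETA = 3   # element of order 5 in K*: 3^5 = 243 = 22*11 + 1

# Row k holds the values of the character chi_k(g) = ZETA^(k*g) on g = 0..4
table = [[pow(ZETA, k * g, P) for g in range(N)] for k in range(N)]

def rank_mod_p(rows, p):
    """Rank of an integer matrix over Z/p (p prime), by Gauss-Jordan elimination."""
    rows = [r[:] for r in rows]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] % p), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse (Python 3.8+)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col] % p:
                factor = rows[r][col]
                rows[r] = [(x - factor * y) % p
                           for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank

character_rank = rank_mod_p(table, P)
print(character_rank)
```

Full rank means no nontrivial $K$-linear combination of these five characters is the zero function on $G$, in agreement with the theorem.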


Corollary 8.2.18 Let $L$ be a finite field extension of $K$. Then $|G(L/K)| \leq [L : K]$.


Proof We have already seen that under the hypothesis of the corollary $G(L/K)$ is
finite. Suppose that $G(L/K) = \{\sigma_1, \sigma_2, \ldots, \sigma_n\}$, where $\sigma_i \neq \sigma_j$ for $i \neq j$. Suppose
the contrary. Then $[L : K] < n$. Suppose that $[L : K] = m$ and $\{x_1, x_2, \ldots, x_m\}$ is
a basis of $L$ over $K$. Consider the elements $C_1, C_2, \ldots, C_n$ of the set $L^m$ consisting
of row vectors given by

\[ C_i = (\sigma_i(x_1), \sigma_i(x_2), \ldots, \sigma_i(x_m)). \]

Since $L^m$ is an $m$-dimensional vector space over $L$ and $n > m$, the set $\{C_1, C_2, \ldots, C_n\}$
is linearly dependent. Thus, there exist $a_1, a_2, \ldots, a_n$ in $L$ not all zero such that

\[ a_1C_1 + a_2C_2 + \cdots + a_nC_n = 0, \]

where 0 in the RHS is the zero of $L^m$. This means that

\[ a_1\sigma_1(x_j) + a_2\sigma_2(x_j) + \cdots + a_n\sigma_n(x_j) = 0 \]

for all $j$. Let $x \in L$. Then, since $\{x_1, x_2, \ldots, x_m\}$ is a basis of $L$ over $K$, $x = \sum_{j=1}^{m} \beta_j x_j$ for some $\beta_j \in K$.
But then, since each $\sigma_i$ is a $K$-automorphism, we have

\[ \sum_{i=1}^{n} a_i\sigma_i(x) = \sum_{j=1}^{m} \beta_j \Big(\sum_{i=1}^{n} a_i\sigma_i(x_j)\Big) = 0. \]

Since not all $a_i$ are zero, $\{\sigma_1, \sigma_2, \ldots, \sigma_n\}$ is linearly dependent. Further, each $\sigma_i$ is a
character of $L^*$ in $L$, and hence by the Dedekind theorem $\{\sigma_1, \sigma_2, \ldots, \sigma_n\}$ should be
linearly independent in $F(L^*, L)$. This is a contradiction. Hence the assumption that
$m < n$ is false. ♯


Theorem 8.2.19 Let $K$ be the fixed field of a finite group $G$ of automorphisms of a
field $L$. Then $|G| = [L : K]$.


Proof By the definition $G \subseteq G(L/K)$, and so by the previous corollary $|G| \leq
|G(L/K)| \leq [L : K]$. Suppose that $|G| < [L : K]$. Let $G = \{\sigma_1, \sigma_2, \ldots, \sigma_n\}$,
where $\sigma_i \neq \sigma_j$ for $i \neq j$. Then, since $n$ is supposed to be strictly less than $[L : K]$,
there exists a subset $\{x_1, x_2, \ldots, x_{n+1}\}$ of $L$ which contains $n + 1$ elements, and
which is linearly independent over $K$. Consider the subset $\{C_1, C_2, \ldots, C_{n+1}\}$ of $L^n$
consisting of rows with $n$ columns and entries in $L$, where

\[ C_i = (\sigma_1(x_i), \sigma_2(x_i), \ldots, \sigma_n(x_i)), \]

$i = 1, 2, \ldots, n + 1$. Since the dimension of $L^n$ over $L$ is $n$, this set is linearly dependent.
Rearranging if necessary, we may assume that $\{C_1, C_2, \ldots, C_m\}$ is a minimal subset
which is linearly dependent. Then there exist $a_1, a_2, \ldots, a_m$ in $L$ not all zero such
that

\[ a_1C_1 + a_2C_2 + \cdots + a_mC_m = 0, \]

where 0 in the RHS is the zero of $L^n$. The minimality assumption implies that all $a_i$
are nonzero. Dividing by $a_1$ we may assume that $a_1 = 1$. Thus, we have

\[ \sum_{i=1}^{m} a_i\sigma_j(x_i) = 0 \tag{8.2.5} \]

for all $j$, $1 \leq j \leq n$, where $a_1 = 1$, and no $a_i$ is zero. Let $\sigma$ be an arbitrary element
of $G$. Then applying $\sigma$ on the above equation we get that

\[ \sum_{i=1}^{m} \sigma(a_i)\,\sigma\sigma_j(x_i) = 0 \]

for all $j$. Since multiplication by an element of the group to the elements of the group
permutes the elements of the group, we have

\[ \sum_{i=1}^{m} \sigma(a_i)\sigma_j(x_i) = 0 \tag{8.2.6} \]

for all $j$. Subtracting the Eq. (8.2.6) from the Eq. (8.2.5), and observing that $a_1 = 1$,
we see that

\[ \sum_{i=2}^{m} (a_i - \sigma(a_i))\sigma_j(x_i) = 0 \]

for all $j$. It follows from the minimality assumption of $m$ that $a_i - \sigma(a_i) =
0$ for all $i$. Thus, $\sigma(a_i) = a_i$ for all $i$. Since $\sigma$ is an arbitrary element of $G$, each $a_i$
belongs to $K$, the fixed field of $G$. But, then, since each $a_i \in K$, we see from the Eq. (8.2.5) that

\[ \sigma_j\Big(\sum_{i=1}^{m} a_ix_i\Big) = 0 \]

for all $j$. Since $\sigma_j$ is an automorphism, we have

\[ \sum_{i=1}^{m} a_ix_i = 0. \]

This is a contradiction to the fact that each $a_i$ is nonzero, and $\{x_1, x_2, \ldots, x_{n+1}\}$ is
linearly independent over $K$. ♯


Corollary 8.2.20 Under the hypothesis of the above theorem, $G = G(L/K)$. In
other words, $G(L/F(G)) = G$.


Proof Already $G \subseteq G(L/K)$. Also, from Corollary 8.2.18 (a consequence of the Dedekind theorem) and the above
theorem, we have $|G(L/K)| \leq [L : K] = |G|$. The result follows. ♯


Corollary 8.2.21 Let $L$ be a finite field extension of $K$. Then $L$ is a Galois extension
of $K$ if and only if $|G(L/K)| = [L : K]$.


Proof Suppose that $L$ is a Galois extension of $K$. Then by the definition $K =
F(G(L/K))$. Hence from the above theorem $|G(L/K)| = [L : K]$. Conversely,
suppose that $|G(L/K)| = [L : K]$. Let $L_1 = F(G(L/K))$. Then $K \subseteq L_1$. From the
hypothesis and from the previous theorem, $[L : K] = |G(L/K)| = [L : L_1] \leq [L : K]$.
Thus, $[L : L_1] = [L : K]$, and so $K = L_1$. ♯


Corollary 8.2.22 Let $L$ be a field extension of $K$, and $\alpha$ be an element of $L$ which
is algebraic over $K$. Then $K(\alpha)$ is a Galois extension of $K$ if and only if there are
$\deg min_K(\alpha)(X)$ distinct roots of $min_K(\alpha)(X)$, all belonging to $K(\alpha)$.


Proof We have already seen that $|G(K(\alpha)/K)|$ is equal to the number of distinct
roots of $min_K(\alpha)(X)$ which belong to $K(\alpha)$. Further, we know that $[K(\alpha) : K] =
\deg(min_K(\alpha)(X))$. From the previous corollary, it follows that $K(\alpha)$ is a Galois extension
of $K$ if and only if $|G(K(\alpha)/K)| = [K(\alpha) : K]$. The result follows. ♯


Example 8.2.23 $\mathbb{Q}(2^{1/3}, \omega) = \mathbb{Q}(2^{1/3} + \omega)$ is a Galois extension of $\mathbb{Q}$, for
$|G(\mathbb{Q}(2^{1/3} + \omega)/\mathbb{Q})| = |S_3| = 6 = [\mathbb{Q}(2^{1/3} + \omega) : \mathbb{Q}]$.


Example 8.2.24 Consider the field $L = K(X_1, X_2, \ldots, X_n)$ of rational functions
in $n$ variables over the field $K$. It is the field of fractions of the polynomial
ring $K[X_1, X_2, \ldots, X_n]$. If $p \in S_n$ is a permutation of degree $n$, then we have a
unique isomorphism from $K[X_1, X_2, \ldots, X_n]$ to itself given by $f(X_1, X_2, \ldots, X_n) \mapsto
f(X_{p(1)}, X_{p(2)}, \ldots, X_{p(n)})$, which extends uniquely to an automorphism of $L$. Thus, $S_n$
is isomorphic to a subgroup of $Aut(L)$. Let $G$ be a subgroup of $S_n$. Then from what
we have done above, it follows that $L$ is a Galois extension of $F(G)$, and the Galois
group of this extension is $G$. Since, by Cayley's theorem, every finite group is isomorphic
to a subgroup of some $S_n$, this, in particular, says that every finite group will
appear as a Galois group of some Galois extension.


Exercises 


8.2.1 Determine the Galois group of the following extensions. Which of them are 
Galois extensions: 


(i) $\mathbb{Q}(e)$ over $\mathbb{Q}$.
(ii) $\mathbb{Q}(5^{1/3})$ over $\mathbb{Q}$.
(iii) $\mathbb{Q}(\sqrt{p}, \sqrt{q})$ over $\mathbb{Q}$, where $p$ and $q$ are distinct primes.
(iv) R over Q. 
(v) C over Q. 


8.2.2 We have seen that $\mathbb{Q}(2^{1/3}, \omega)$ is a Galois extension of $\mathbb{Q}$ with Galois group $S_3$.
Find the fixed field $K$ of $A_3$. Is $K$ a Galois extension of $\mathbb{Q}$? Support. Find also a fixed
field of a subgroup of order 2 in $S_3$. Is that a Galois extension of $\mathbb{Q}$? Support.


8.2.3 Suppose that L is a Galois extension of K, and F a subfield of L containing 
K. Show that L is also a Galois extension of F. Show by means of an example that 
F need not be a Galois extension of K. 


8.2.4 Is $\mathbb{Q}(\sqrt{3}\,i)$ isomorphic to $\mathbb{Q}(\sqrt{3})$? Support.


8.2.5 Let K and L be two extensions of Q of degrees 2. Are they necessarily iso- 
morphic? Support. 


8.2.6 Find the group of automorphisms of $\mathbb{Z}_p(X)$. Show that it is finite. Find its
order.


8.2.7 Let $\rho$ be a primitive 4th root of unity. Find the Galois group of $\mathbb{Q}(\rho)/\mathbb{Q}$. Is
this a Galois extension? Support.


8.2.8 Show that any field extension L of degree 2 of a field K of characteristic 
different from 2 is a Galois extension. 


8.2.9 Let $L$ be a finite field extension of $K$, and $L'$ any extension of $K$. Let $\Sigma_K(L, L')$
be the set of all field homomorphisms from $L$ to $L'$ which fix the members of $K$ (in
particular, $\Sigma_K(L, L) = G(L/K)$). Use the Dedekind theorem and imitate the proof
of Corollary 8.2.18 to show that $|\Sigma_K(L, L')| \leq [L : K]$.


8.2.10 Show that a finite field extension $L$ of $K$ is a Galois extension if and only if
the following two conditions hold.


(i) There is a field extension $L'$ of $L$ such that $|\Sigma_K(L, L')| = [L : K]$.
(ii) Given any field homomorphism $\sigma$ from $L$ to a field extension $L'$ of $L$ which
fixes every element of $K$, $\sigma(L) = L$.


8.2.11 Prove the following generalization of the result in Exercise 8.2.9: Let $K$ and
$K'$ be fields, and $\sigma$ a homomorphism from $K$ to $K'$. Let $L$ be a finite extension of $K$ and $L'$ an extension of $K'$. Show that
the number of extensions of $\sigma$ to homomorphisms from $L$ to $L'$ is at most $[L : K]$.


8.3 Splitting Field, Normal Extensions 


A finite field extension $L$ of $K$ may cease (see Exercise 8.2.10 of the previous
section) to be a Galois extension because of either of the following two reasons:


(i) There is a field extension $L'$ of $L$ and a field homomorphism $\sigma$ from $L$ to $L'$ fixing
each element of $K$ such that $\sigma(L) \neq L$.
(ii) Given any field extension $L'$ of $L$, $|\Sigma_K(L, L')| < [L : K]$.


In this section, we study those finite extensions $L$ of $K$ for which (i) is not the reason.


Definition 8.3.1 A finite extension $L$ of $K$ is called a normal extension if given any
field extension $L'$ of $L$, and a field homomorphism $\sigma$ from $L$ to $L'$ which fixes every
element of $K$, $\sigma(L) = L$.


Definition 8.3.2 A finite extension $L$ of $K$ is called a separable extension if there
exists an extension $L'$ of $L$ such that $|\Sigma_K(L, L')| = [L : K]$.


Thus, a finite extension is a Galois extension if and only if it is separable as well 
as normal. 
Separable extensions will be the subject matter of study of the next section. 


Theorem 8.3.3 Let $K$ be a field, and $f(X)$ be a nonconstant polynomial of degree $n$
over $K$. Then there exists a field extension $L$ of $K$ such that $[L : K] \leq n$, and $f(X)$
has a root in $L$.


Proof Every nonconstant polynomial in $K[X]$ is a product of irreducible polynomials
of positive degrees in $K[X]$, and if an element $\alpha$ in a field extension $L$ of $K$ is a root
of an irreducible factor of $f(X)$, then it is also a root of $f(X)$. Thus, it is sufficient
to prove the result for irreducible polynomials in $K[X]$ of positive degrees. Let
$p(X)$ be an irreducible polynomial in $K[X]$ of degree $n$. Since $K[X]$ is a principal
ideal domain, the ideal $\langle p(X) \rangle$ generated by $p(X)$ is a maximal ideal. Thus, the
quotient ring $K[X]/\langle p(X) \rangle$ is a field. Let us denote it by $F$. Define a map $\eta$ from
$K$ to $F$ by $\eta(a) = a + \langle p(X) \rangle$. It is easily seen that $\eta$ is a ring homomorphism.
Suppose that $\eta(a) = \eta(b)$. Then $a + \langle p(X) \rangle = b + \langle p(X) \rangle$. This means
that $a - b$ belongs to $\langle p(X) \rangle$. Since $p(X)$ is an irreducible polynomial of positive
degree, this is possible if and only if $a - b = 0$. Thus $a = b$. This shows that
$\eta$ is an injective homomorphism, and so it is an embedding. Any element of $F$ is of
the form $f(X) + \langle p(X) \rangle = r(X) + \langle p(X) \rangle$, where $r(X)$ is the remainder
obtained when we divide $f(X)$ by $p(X)$. It is also clear that if $r_1(X) + \langle p(X) \rangle =
r_2(X) + \langle p(X) \rangle$, and $\deg(r_i(X)) < \deg(p(X))$, $i = 1, 2$, then $r_1(X) = r_2(X)$.
Let $\wp_n(K)$ denote the set of all polynomials in $K[X]$ whose degrees are less than
$n$. Then from what we have proved above it follows that the map $\rho$ from $\wp_n(K)$ to
$K[X]/\langle p(X) \rangle$ defined by $\rho(r(X)) = r(X) + \langle p(X) \rangle$ is a bijective mapping
whose restriction to $K$ is $\eta$. Pull back the operations of $K[X]/\langle p(X) \rangle$ to those of
$\wp_n(K)$ with the help of the map $\rho$. The operations $\oplus$ and $\star$ on $\wp_n(K)$, thus obtained,
are given by
\[ r(X) \oplus s(X) = r(X) + s(X), \]
and
\[ r(X) \star s(X) = t(X), \]
where $t(X)$ is the remainder obtained when $r(X) \cdot s(X)$ is divided by $p(X)$. Clearly,
$\wp_n(K)$ becomes a field such that $\rho$ is an isomorphism, and $K$ is a subfield of $\wp_n(K)$.
Further, $X \in \wp_n(K)$, and $X$ is a root of $p(X)$, for $p(X)$ represents 0 in $\wp_n(K)$. ♯
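The field built in this proof can be made concrete with a short computation. The sketch below is a minimal illustration under assumptions chosen here: it takes $K = \mathbb{Q}$ and $p(X) = X^3 - 2$, stores a polynomial of degree less than 3 as a list of coefficients (lowest degree first), and implements the product "multiply, then take the remainder modulo $p(X)$". The names `remainder` and `star` are hypothetical helpers.

```python
from fractions import Fraction

# p(X) = X^3 - 2, monic and irreducible over Q; coefficients lowest degree first
P_COEFFS = [Fraction(-2), Fraction(0), Fraction(0), Fraction(1)]

def remainder(f, p):
    """Remainder of f on division by the monic polynomial p."""
    f = f[:]
    n = len(p) - 1                    # degree of p
    while len(f) > n:                 # while deg f >= deg p
        lead = f.pop()                # leading coefficient of f
        shift = len(f) - n            # subtract lead * X^shift * p(X)
        for i in range(n):
            f[shift + i] -= lead * p[i]
    return f + [Fraction(0)] * (n - len(f))

def star(r, s, p=P_COEFFS):
    """The product of r and s in K[X]/<p(X)>, realized on remainders."""
    prod = [Fraction(0)] * (len(r) + len(s) - 1)
    for i, a in enumerate(r):
        for j, b in enumerate(s):
            prod[i + j] += a * b
    return remainder(prod, p)

x = [Fraction(0), Fraction(1), Fraction(0)]           # the element X
x_cubed = star(star(x, x), x)                         # X * X * X in the quotient
half_x_squared = [Fraction(0), Fraction(0), Fraction(1, 2)]
x_times_inverse = star(x, half_x_squared)             # X * (X^2 / 2)

print(x_cubed, x_times_inverse)
```

In this model the cube of `x` reduces to the constant 2, so `x` is a root of $X^3 - 2$, and `x` has the inverse $X^2/2$, in line with the quotient being a field.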


Let $f(X)$ be a polynomial in $K[X]$. We say that $f(X)$ has all its roots in a field
extension $L$ of $K$ if $f(X)$ splits into a product of linear factors in $L[X]$ in the sense that
$f(X) = a(X - \alpha_1)(X - \alpha_2) \cdots (X - \alpha_n)$ for some $a \in K$ and $\alpha_1, \alpha_2, \ldots, \alpha_n$ in $L$.
We also say that $f(X)$ splits over $L$.


Corollary 8.3.4 Let $f(X) \in K[X]$ be a polynomial of degree $n$. Then there is a field
extension $L$ of $K$ of degree at most $n!$ such that $f(X)$ has all its roots in $L$.


Proof The proof is by induction on the degree of $f(X)$. If $f(X)$ is of degree 1,
then it has only one root, which is in $K$. Assume that the result is true for all those
polynomials whose degrees are less than $n$. Let $f(X)$ be a polynomial of degree $n$.
From the above theorem it follows that there is an extension $L$ of $K$ which has a root
$\alpha$ (say) of $f(X)$, and $[L : K]$ is at most $n$. Then $f(X) = (X - \alpha)g(X)$, where $g(X)$
is a polynomial in $L[X]$ of degree $n - 1$. By the induction hypothesis there is an
extension $F$ of $L$ of degree at most $(n - 1)!$ in which $g(X)$ has all its roots. Clearly,
$f(X)$ has all its roots in $F$, and $[F : K] = [F : L][L : K]$ is at most $n!$. ♯


Definition 8.3.5 Let K be a field, and f(X) be a polynomial in K[X]. A minimal 
field extension L of K such that f(X) splits over L is called a splitting field of f(X) 
over K. 


We have seen that given any polynomial $f(X) \in K[X]$, there is a field extension $L$
of $K$ such that $f(X)$ splits over $L$. Let $\alpha_1, \alpha_2, \ldots, \alpha_n$ be all roots of $f(X)$ in $L$. Then
$K(\alpha_1, \alpha_2, \ldots, \alpha_n)$ is a minimal field extension of $K$ in which $f(X)$ splits. Thus, we
have


Corollary 8.3.6 Let $K$ be a field, and $f(X)$ be a polynomial over $K$. Then $f(X)$ has
a splitting field over $K$. ♯


The above result says that a splitting field of a polynomial exists. Our next aim is 
to show that it is unique up to K-isomorphism. 


Proposition 8.3.7 Let K and K' be fields, and σ be an isomorphism from K to K'.
Let L be a field extension of K, and L' be an extension of K'. Let a be an element of
L which is algebraic over K with minimal polynomial p(X), and b an element of L'
which is a root of $p^{\sigma}(X)$. Then there is an extension $\bar{\sigma}$ of σ to an isomorphism from
K(a) to K'(b) such that $\bar{\sigma}(a) = b$.


Proof Since σ is an isomorphism, and p(X) is irreducible, $p^{\sigma}(X)$ is also irreducible
in K'[X]. In particular, $p^{\sigma}(X)$ is the minimum polynomial of b. The map η from
$K[X]/\langle p(X) \rangle$ to $K'[X]/\langle p^{\sigma}(X) \rangle$ defined by
$\eta(f(X) + \langle p(X) \rangle) = f^{\sigma}(X) + \langle p^{\sigma}(X) \rangle$ is an isomorphism (verify).
Further, we have an isomorphism φ from K(a) to $K[X]/\langle p(X) \rangle$ which fixes the members
of K and takes a to $X + \langle p(X) \rangle$. Similarly, we have an isomorphism ψ from K'(b) to
$K'[X]/\langle p^{\sigma}(X) \rangle$ which fixes the members of K' and takes b to $X + \langle p^{\sigma}(X) \rangle$.
It is clear that the map $\psi^{-1} \circ \eta \circ \phi$ has the desired property. ♯


Corollary 8.3.8 Let K and K' be fields, and σ be an isomorphism from K to K'. Let
$f(X) \in K[X]$. Let L be a splitting field of f(X) over K, and L' be a splitting field of
$f^{\sigma}(X)$ over K'. Then σ can be extended to an isomorphism from L to L'.


Proof The proof is by induction on the degree of f(X). If the degree of f(X) is 1,
then there is nothing to do. Assume that the result is true for all polynomials
whose degrees are less than n. Let f(X) be a polynomial of degree n. Suppose that f(X)
has a root $a \in K$. Then $f(X) = (X - a)g(X)$, where g(X) is a polynomial in K[X] of
degree $n - 1$. Clearly, L is a splitting field of g(X) over K, and L' is a splitting field
of $g^{\sigma}(X)$ over K'. By the induction hypothesis, σ can be extended to an isomorphism
from L to L'. Suppose that f(X) has no root in K. Let p(X) be an irreducible factor of f(X).
Then the degree of p(X) is greater than 1, and $p^{\sigma}(X)$ is an irreducible factor of $f^{\sigma}(X)$.
Let a be a root of p(X) in L, and b be a root of $p^{\sigma}(X)$ in L'. From the previous result,
σ can be extended to an isomorphism σ' from K(a) to K'(b) such that $\sigma'(a) = b$.
Now $X - a$ divides f(X) over K(a), and $X - b$ divides $f^{\sigma}(X)$ over K'(b). Suppose
that $f(X) = (X - a)g(X)$ and $f^{\sigma}(X) = (X - b)h(X)$, where $g(X) \in K(a)[X]$, and
$h(X) = g^{\sigma'}(X)$ in K'(b)[X]. Clearly, L is a splitting field of g(X) over K(a), and L'
is a splitting field of $h(X) = g^{\sigma'}(X)$ over K'(b). By the induction hypothesis,
σ' can be extended to an isomorphism from L to L'. Thus, σ can be extended to an
isomorphism from L to L'. ♯


Corollary 8.3.9 The splitting field of a polynomial f(X) in K[X] is unique up to
K-isomorphism.




Proof Take $\sigma = I_K$, the identity map on K, and apply the above corollary. ♯


Corollary 8.3.10 If L is a splitting field of a polynomial $f(X) \in K[X]$ of degree n,
then $[L : K] \le n!$. ♯


Example 8.3.11 The equality may also hold in the above corollary. Consider the
field $L = K(X_1, X_2, \ldots, X_n)$ of rational functions in n indeterminates over the field
K. Define a map η from the symmetric group $S_n$ of degree n to the group Aut(L) of
automorphisms of L by

$$\eta(p)\left(\frac{f(X_1, X_2, \ldots, X_n)}{g(X_1, X_2, \ldots, X_n)}\right) = \frac{f(X_{p(1)}, X_{p(2)}, \ldots, X_{p(n)})}{g(X_{p(1)}, X_{p(2)}, \ldots, X_{p(n)})},$$

where $p \in S_n$. It is clear that η is an injective homomorphism. Let $F = F(\eta(S_n))$ be
the fixed field of $\eta(S_n)$. Let us denote $\eta(S_n)$ by G. Let $s_1, s_2, \ldots, s_n$ be the n elementary
symmetric polynomials given by

$$s_k = \sum_{i_1 < i_2 < \cdots < i_k} X_{i_1} X_{i_2} \cdots X_{i_k},$$

where the summation is taken over all k-tuples of distinct elements $i_1 < i_2 < \cdots < i_k$.
In particular, $s_1 = X_1 + X_2 + \cdots + X_n$ and $s_n = X_1 X_2 \cdots X_n$. We show that
$F = K(s_1, s_2, \ldots, s_n)$, and L is the splitting field of the polynomial

$$f(X) = X^n - s_1 X^{n-1} + s_2 X^{n-2} - \cdots + (-1)^n s_n$$

over $K(s_1, s_2, \ldots, s_n)$. Clearly, each $s_k \in F(G) = F$, and so $K(s_1, s_2, \ldots, s_n) \subseteq F$.
Further, since F is the fixed field of G, we have $[L : F] = |G| = |S_n| = n!$. Next,
we note that

$$f(X) = (X - X_1)(X - X_2) \cdots (X - X_n)$$

(proof follows from easy expansion). Thus, L is the splitting field of f(X) over
$K(s_1, s_2, \ldots, s_n)$. In turn, we have

$$n! = [L : F] \le [L : K(s_1, s_2, \ldots, s_n)] \le n!.$$

The first inequality is true because $K(s_1, s_2, \ldots, s_n) \subseteq F$, and the second inequality
is true because of the previous corollary. This shows that $F = K(s_1, s_2, \ldots, s_n)$. ♯
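The expansion of the product of the linear factors into elementary symmetric polynomials can be spot-checked numerically. The following is a minimal Python sketch, not part of the original text: it substitutes sample integers for the indeterminates (an assumption made only for illustration), and the helper names `elementary_symmetric` and `expand_linear_factors` are hypothetical.

```python
from itertools import combinations
from math import prod


def elementary_symmetric(xs, k):
    """s_k: sum of the products over all k-element subsets i_1 < ... < i_k."""
    return sum(prod(c) for c in combinations(xs, k))


def expand_linear_factors(xs):
    """Coefficients of (X - x_1)...(X - x_n), highest degree first."""
    coeffs = [1]
    for x in xs:
        # Multiply the current polynomial by (X - x):
        # X * p shifts the coefficients left, x * p is subtracted one place later.
        coeffs = [a - x * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs


# Sample values standing in for X_1, ..., X_4 (illustrative assumption).
xs = [2, 3, 5, 7]
expanded = expand_linear_factors(xs)
# Coefficient of X^(n-k) should be (-1)^k * s_k.
expected = [(-1) ** k * elementary_symmetric(xs, k) for k in range(len(xs) + 1)]
print(expanded == expected)
```

A numerical check at sample values is of course not a proof; it only illustrates the sign pattern $(-1)^k s_k$ of the coefficients.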


Example 8.3.12 The field C of complex numbers is the splitting field of the polynomial
$X^2 + 1$ over the field R of real numbers. It is also the splitting field of $X^2 + 3$ over R.


Example 8.3.13 $K = \mathbb{Q}(2^{1/3}, \omega)$ is the splitting field of $X^3 - 2$ over Q. This is
because K is the smallest field containing all the roots $2^{1/3}$, $2^{1/3}\omega$, and $2^{1/3}\omega^2$ of $X^3 - 2$.
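As a worked illustration (a sketch added here, not part of the original example), the degree of this splitting field can be computed in two steps, and it attains the bound $n! = 3!$ of Corollary 8.3.10. Here ω is assumed to denote a primitive cube root of unity, as elsewhere in this section.

```latex
\begin{align*}
[\mathbb{Q}(2^{1/3}) : \mathbb{Q}] &= 3
  && \text{($X^3 - 2$ is irreducible over $\mathbb{Q}$ by Eisenstein with the prime 2)},\\
X^3 - 2 &= (X - 2^{1/3})(X^2 + 2^{1/3}X + 2^{2/3})
  && \text{over } \mathbb{Q}(2^{1/3}),\\
[\mathbb{Q}(2^{1/3}, \omega) : \mathbb{Q}(2^{1/3})] &= 2
  && \text{(the quadratic factor has no real root, and } \mathbb{Q}(2^{1/3}) \subseteq \mathbb{R}),\\
[\mathbb{Q}(2^{1/3}, \omega) : \mathbb{Q}] &= 2 \cdot 3 = 6 = 3!.
\end{align*}
```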




Algebraically Closed Field and Algebraic Closure 


Definition 8.3.14 A field K is said to be an algebraically closed field if every
polynomial f(X) in K[X] has all its roots in K, or equivalently, every irreducible element
of K[X] is a linear polynomial of the type $aX + b$, $a, b \in K$, $a \neq 0$.


Proposition 8.3.15 The following conditions on a field K are equivalent. 


1. K is an algebraically closed field.
2. If f(X) is any polynomial in K[X], then K is the splitting field of f(X).
3. Every nonconstant polynomial $f(X) \in K[X]$ has a root in K.
4. There is no proper algebraic extension of K.
5. There is no proper finite extension of K.


Proof 1 ⇒ 2. Assume 1. Let f(X) be a polynomial in K[X]. Since K is algebraically
closed, f(X) has all its roots in K, and so K is the splitting field of f(X).

2 ⇒ 3. Assume 2. Let f(X) be a nonconstant polynomial in K[X]. By 2, K is the
splitting field of f(X), and so K contains all roots of f(X). Since f(X) is nonconstant,
it has at least one root.

3 ⇒ 4. Assume 3. Let L be an algebraic extension of K. Let $a \in L$. Suppose
that $a \notin K$. Then the minimum polynomial $\min_K(a)(X)$ is an irreducible polynomial
of degree greater than 1, and so it has no root in K. This is a contradiction to the
assumption.

4 ⇒ 5. Assume 4. Since every finite extension is an algebraic extension, there
is no proper finite extension of K.

5 ⇒ 1. Assume 5. Then there is no proper finite extension of K. Let $f(X) \in K[X]$ be a
polynomial of positive degree. Let L be the splitting field of f(X). Then L is a finite
extension of degree at most n!, where n is the degree of f(X). By 5, L = K, and so
K has all roots of f(X). By the definition, K is algebraically closed. ♯


Example 8.3.16 We shall see later that the field C of complex numbers is an alge- 
braically closed field. This is known as the fundamental theorem of algebra which 
was first proved by Gauss. 


Proposition 8.3.17 No finite field can be algebraically closed. 


Proof Let $K = \{a_1, a_2, \ldots, a_n\}$ be a finite field. Then
$f(X) = (X - a_1)(X - a_2) \cdots (X - a_n) + 1$ has no root in K, since $f(a_i) = 1 \neq 0$ for each i. ♯
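The argument can be checked directly for the prime fields $\mathbb{Z}_p$. The sketch below is an illustration added here, not part of the original text; it models $\mathbb{Z}_p$ by the integers 0, ..., p − 1 with arithmetic modulo p, and the function names are hypothetical.

```python
def f(x, p):
    """Evaluate (X - 0)(X - 1)...(X - (p - 1)) + 1 at x, working in Z_p."""
    value = 1
    for a in range(p):
        value = (value * (x - a)) % p
    return (value + 1) % p


def roots_in_Zp(p):
    """All elements of Z_p at which the polynomial above vanishes."""
    return [x for x in range(p) if f(x, p) == 0]


for p in (2, 3, 5, 7, 11):
    # Every element of Z_p kills the product, so f takes the value 1 everywhere.
    print(p, roots_in_Zp(p), {f(x, p) for x in range(p)})
```

For each prime tested, the list of roots is empty and the only value taken is 1, matching the proof.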


Now, we shall show that every field can be enlarged to an algebraically closed 
field. 


Definition 8.3.18 An algebraic extension L of K is called an algebraic closure of
K if L is an algebraically closed field. Thus, a maximal algebraic extension of K, if
it exists, is called an algebraic closure of K.


Now, we shall show that every field K has an algebraic closure and it is unique
up to K-isomorphism.

The following proposition is essential to escape some set theoretic logical prob- 
lems in proving the existence of algebraic closure. 




Proposition 8.3.19 Let L be an algebraic extension of K. If K is finite, then the 
cardinality | L | of L is at most that of the set N of natural numbers (i.e., it is finite or
countably infinite). If K is infinite, then the cardinality | L | of L is the same as the 
cardinality | K | of K. 


Proof Let $\Sigma_K$ denote the set of irreducible monic polynomials over the field K. To
every member $p(X) \in \Sigma_K$, we associate the subset $Y_{p(X)}$ consisting of roots of p(X)
in L. $Y_{p(X)}$ may be the empty set also. Clearly, $Y_{p(X)}$ is a finite subset of L containing at
most n elements, where n is the degree of p(X). It is clear that $L = \bigcup_{p(X) \in \Sigma_K} Y_{p(X)}$.
Now, each irreducible monic polynomial of degree n is determined uniquely by a
choice of an ordered n-tuple in K. Thus, the cardinality of $\Sigma_K$ is the same as that of the
set $\bigcup_{n \in \mathbb{N}} K^n$. If K is finite, then since a countable union of disjoint finite sets is again
countable, it follows, in this case, that $\Sigma_K$ has the same cardinality as that of N. Next,
if K is infinite, then $K^n$ has the same cardinality as K. Since K is infinite, a countable
union of sets having the same cardinality as that of K again has the same cardinality
as that of K. Since $Y_{p(X)}$ is finite, the same argument shows that if K is finite, then
the cardinality of L is at most that of N, and if K is infinite, then its cardinality is
the same as that of K. ♯


Theorem 8.3.20 Every field has an algebraic closure. 


Proof Let K be a field. Observe that there is no set containing all algebraic extensions
of K. Instead, we need to consider a set of algebraic extensions of K so that every
algebraic extension of K is K-isomorphic to a member of the set. For this purpose, let
Σ be a set containing K, and whose cardinality is strictly larger than that of $K \cup \mathbb{N}$.
This is possible because the power set of a set always has larger cardinality than that of
the set. Let Ω be the set of all fields which are algebraic extensions of K, and whose
set part is contained in Σ.

Clearly, Ω is a nonempty set, for $K \in \Omega$. Define a partial order ≤ in Ω by $L_1 \le L_2$ if $L_1$
is a subfield of $L_2$. Thus, (Ω, ≤) is a nonempty partially ordered set. Let $\{L_\alpha \mid \alpha \in A\}$
be a chain in Ω. Let $L_0 = \bigcup_{\alpha \in A} L_\alpha$. Then $L_0$ is also a field contained in Σ of which
all $L_\alpha$ are subfields, and which is an algebraic extension of K. Thus, $L_0 \in \Omega$ is an
upper bound of the chain. By Zorn's lemma Ω has a maximal member L (say).
Then L is an algebraic extension of K. We show that L is an algebraic closure of
K by showing that it is an algebraically closed field. Suppose not. Then there is
a proper algebraic extension F of L. From the above proposition, it follows that
the cardinality of F is strictly less than that of Σ. Hence there is a subset L' of Σ
containing L properly, and a bijective map η from L' to F which is the identity on L. We
can pull back the operations of F to L' so that L' becomes a proper algebraic
extension of L. Clearly, L' is also an algebraic extension of K, and $L' \in \Omega$. This
contradicts the supposition that L is a maximal member of Ω. This completes the
proof of the fact that L is an algebraic closure of K. ♯


Theorem 8.3.21 Let σ be an isomorphism from a field K to a field K'. Let $\bar{K}$ be an
algebraic closure of K, and $\bar{K'}$ be an algebraic closure of K'. Then σ can be extended
to an isomorphism from $\bar{K}$ to $\bar{K'}$.




Proof Let Ω be the set of triples (L, σ', L'), where L is a subfield of $\bar{K}$ containing
K, L' is a subfield of $\bar{K'}$ containing K', and σ' is an extension of σ to an isomorphism
from L to L'. The set Ω is a nonempty set, for (K, σ, K') is a member of Ω. Define
a relation ≤ on Ω by

$$(L_1, \sigma_1, L_1') \le (L_2, \sigma_2, L_2') \iff L_1 \subseteq L_2,\ L_1' \subseteq L_2' \text{ and } \sigma_2 \text{ is an extension of } \sigma_1.$$

Thus, (Ω, ≤) is a nonempty partially ordered set. Let $\{(L_\alpha, \sigma_\alpha, L_\alpha') \mid \alpha \in A\}$ be a
chain in Ω. Let $L_0 = \bigcup_{\alpha \in A} L_\alpha$, $L_0' = \bigcup_{\alpha \in A} L_\alpha'$, and let $\sigma_0$ be the map from $L_0$ to $L_0'$
defined by the property that $\sigma_0$ restricted to $L_\alpha$ is $\sigma_\alpha$. Then it is easy to observe that
the triple $(L_0, \sigma_0, L_0')$ is a member of Ω which is an upper bound of the chain. By
Zorn's lemma Ω has a maximal member $(L, \bar{\sigma}, L')$ (say). We show that $L = \bar{K}$
and $L' = \bar{K'}$, which will complete the proof of the theorem. Suppose that $a \in \bar{K} - L$.
Then a is algebraic over L. Let p(X) be the minimum polynomial of a over L. Since
$a \notin L$, p(X) is an irreducible polynomial of degree greater than 1. Since $\bar{\sigma}$ is an
isomorphism, it follows that $p^{\bar{\sigma}}(X)$ is an irreducible polynomial over L' of degree
greater than 1. Since $\bar{K'}$ is algebraically closed there is a root b in $\bar{K'}$ of $p^{\bar{\sigma}}(X)$. By
Proposition 8.3.7, we get an extension τ of $\bar{\sigma}$ to an isomorphism from L(a) to
L'(b). Thus, the triple (L(a), τ, L'(b)) belongs to Ω, and this is a contradiction to
the maximality of the triple $(L, \bar{\sigma}, L')$. Hence $L = \bar{K}$. But then $\bar{\sigma}(\bar{K})$ is also an
algebraically closed field of which $\bar{K'}$ is an algebraic extension. This means that
$\bar{\sigma}(\bar{K}) = \bar{K'}$. This completes the proof. ♯


Taking K = K' and $\sigma = I_K$ in the above theorem, we get the following
corollary:


Corollary 8.3.22 The algebraic closure of K is unique up to K-isomorphism. ♯

The algebraic closure of K will usually be denoted by $\bar{K}$.


Corollary 8.3.23 Let L be an algebraic extension of K. Then the algebraic closure
$\bar{L}$ of L is K-isomorphic to the algebraic closure $\bar{K}$ of K.


Proof Follows from the fact that the algebraic closure $\bar{L}$ of L is also an algebraic
extension of K which is algebraically closed. ♯


Corollary 8.3.24 Let K be a field and S a set of polynomials over K. Then there is
a field extension L of K in which every polynomial in S has a root. Further, given
any two field extensions $K_1$ and $K_2$ of K such that all the polynomials in S split over
$K_1$ as well as over $K_2$, let $L_1$ be the subfield of $K_1$ generated by K and the roots of
the members of S belonging to $K_1$, and $L_2$ be the subfield of $K_2$ generated by K and
the roots of the members of S belonging to $K_2$. Then $L_1$ and $L_2$ are K-isomorphic.


Proof If $\bar{K}$ is the algebraic closure of K, then all the polynomials in S split over $\bar{K}$.
Let $L_1$ and $L_2$ be as in the hypothesis of the corollary. Then $L_1$ and $L_2$ are both algebraic
extensions of K. Let $\bar{L}_1$ be the algebraic closure of $L_1$, and $\bar{L}_2$ be the algebraic closure of $L_2$.
Then both of them are algebraic closures of K also. Hence there exists a K-isomorphism σ
from $\bar{L}_1$ to $\bar{L}_2$. Clearly σ will take a root of a polynomial in S to a root of the same
polynomial in S. Thus, σ restricted to $L_1$ will be an isomorphism from $L_1$ to $L_2$. ♯




Definition 8.3.25 The unique (up to K-isomorphism) field described in the above
corollary is called the splitting field of the set S of polynomials over K. It is in fact
the smallest field, up to injective embeddings, over which all polynomials in S split.


Remark 8.3.26 The algebraic closure of K is the splitting field of the set of all 
polynomials over K. 


Theorem 8.3.27 Let L be an algebraic extension of K. Then the following conditions 
are equivalent. 


1. L is the splitting field of a family S of polynomials over K.
2. If σ is a K-homomorphism from L to its algebraic closure $\bar{L}$, then σ(L) = L.
3. Every member $\sigma \in G(\bar{L}/K)$ restricted to L is a member of G(L/K).
4. If f(X) is an irreducible polynomial in K[X] which has a root in L, then it has all
its roots in L.


Proof 1 ⇒ 2. Assume 1. Then by the definition of the splitting field of a family of
polynomials, it follows that L is the subfield of $\bar{L}$ generated by K and the roots of the
members of S. Since any K-homomorphism from L to $\bar{L}$ takes a root of a polynomial
in S to a root of the same polynomial, it follows that σ takes the roots of members
of S to the roots of members of S, and it also takes K to K. Hence σ(L) = L.

2 ⇒ 3. Assume 2. If $\sigma \in G(\bar{L}/K)$, then σ restricted to L is a K-homomorphism
from L to $\bar{L}$. By 2, σ(L) = L, and so σ restricted to L is a member of G(L/K).

3 ⇒ 4. Assume 3. Let f(X) be an irreducible polynomial in K[X] which has a
root $a \in L$. Let b be another root of f(X) in $\bar{L}$. Then by Proposition 8.3.7, there is
a K-isomorphism σ from K(a) to K(b) such that σ(a) = b. Clearly, $\bar{L}$ is an algebraic
closure of K(a) as well as of K(b). By Theorem 8.3.21 there is an isomorphism
τ from $\bar{L}$ to $\bar{L}$ which extends σ. From 3, τ(L) = L. Hence $b = \tau(a) \in L$.

4 ⇒ 1. Assume 4. Let S be the set of all polynomials in K[X] having a root in
L. Then from 4, all the roots of members of S are in L. Further, since L is algebraic
over K, every element of L is a root of some polynomial in K[X]. It follows that L is
the splitting field of S over K. ♯


Definition 8.3.28 An algebraic extension L of K is called a normal extension if it 
satisfies any one (and hence all) of the above conditions. 


Corollary 8.3.29 A finite extension L of K is a normal extension if and only if L is 
splitting field of a polynomial over K. 


Proof If L is the splitting field of a polynomial over K, then by the definition, L is a
normal extension of K. Conversely, suppose that L is a finite normal extension. Then
$L = K(a_1, a_2, \ldots, a_r)$ for some $a_1, a_2, \ldots, a_r$ in L. Let $f_i(X)$ be the minimum
polynomial of $a_i$. Let f(X) be the product of $f_1(X), f_2(X), \ldots, f_r(X)$. Then clearly L
is the splitting field of f(X) (note that by the definition L has all roots of f(X)). ♯




Corollary 8.3.30 Every Galois extension is a normal extension. 
Proof Follows from Exercise 8.2.10. ♯


Remark 8.3.31 A normal extension need not be a Galois extension: Let $K = \mathbb{Z}_p(X)$ be
the field of rational functions over the field $\mathbb{Z}_p$ in one variable. Consider the
polynomial $Y^p - X$ in K[Y]. Let L be the splitting field of $Y^p - X$ over K. Then
from the definition, L is a normal extension of K. $Y^p - X$ is irreducible in K[Y] by
the Eisenstein irreducibility criterion (for X is a prime element in $\mathbb{Z}_p[X]$, of which K
is the field of fractions). Let a be a root of this polynomial in L. Then $a^p = X$.
Thus, $Y^p - X = Y^p - a^p = (Y - a)^p$. Hence a is the only root of $Y^p - X$,
which is denoted by $X^{1/p}$. This means that $L = K(X^{1/p})$. If σ is any automorphism
in G(L/K), then it permutes the roots of $Y^p - X$. Hence $\sigma(X^{1/p}) = X^{1/p}$. This
shows that σ is the identity map. Thus, G(L/K) is the trivial group, but the degree
$[L : K] = \deg(Y^p - X) = p$. Hence this extension is not a Galois extension.
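The factorization of $Y^p - a^p$ as a p-th power in characteristic p rests on the fact that the binomial coefficients $\binom{p}{k}$ with $0 < k < p$ are divisible by p. The following Python sketch is an added illustration, not part of the original remark; it checks this divisibility for a few primes and expands $(Y - 1)^p$ modulo p. The function names are hypothetical.

```python
from math import comb


def middle_binomials_vanish(p):
    """True if C(p, k) is divisible by p for every 0 < k < p."""
    return all(comb(p, k) % p == 0 for k in range(1, p))


def frobenius_expansion(p):
    """Coefficients of (Y - 1)^p reduced mod p, lowest degree first."""
    return [(comb(p, k) * (-1) ** (p - k)) % p for k in range(p + 1)]


for p in (2, 3, 5, 7):
    # Only the constant term (-1 mod p) and the leading term 1 survive.
    print(p, middle_binomials_vanish(p), frobenius_expansion(p))
```

The same check fails for a composite exponent such as 4, since $\binom{4}{2} = 6$ is not divisible by 4, which is why the argument needs the characteristic to be prime.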


Corollary 8.3.32 Let L be a normal extension of K. Let F be any intermediary field.
Then L is also a normal extension of F whereas F need not be a normal extension 
of K. 


Proof Let L be a normal extension of K and F an intermediary field. Then L is the
splitting field of a family S of polynomials over K. Since every polynomial over K is
also a polynomial over F, L is also the splitting field of the same set S of polynomials
over F. This shows that L is a normal extension of F. Next, we show that F need
not be a normal extension of K. Consider the extension $\mathbb{Q}(2^{1/3}, \omega)$ of Q. This is the
splitting field over Q of the polynomial $X^3 - 2$, and so it is a normal extension (in fact it
is already seen to be a Galois extension). Further, $\mathbb{Q}(2^{1/3})$ is an intermediary field
which is not normal. Indeed, $X^3 - 2$ has a root in $\mathbb{Q}(2^{1/3})$ but not all its roots are
in $\mathbb{Q}(2^{1/3})$. ♯


Remark 8.3.33 If L is a normal extension of F, and F is a normal extension of K, then
L need not be a normal extension of K. For example, $\mathbb{Q}(\sqrt{2})$ is a normal extension
(splitting field of $X^2 - 2$) of Q, and $\mathbb{Q}(\sqrt[4]{2})$ is a normal extension of $\mathbb{Q}(\sqrt{2})$, whereas
$\mathbb{Q}(\sqrt[4]{2})$ is not a normal extension of Q. Clearly, $X^4 - 2$ is the minimum polynomial
of $\sqrt[4]{2}$ over Q, whereas not all roots of $X^4 - 2$ are in $\mathbb{Q}(\sqrt[4]{2})$. For example,
$\sqrt[4]{2}\,i$ is also a root of the polynomial $X^4 - 2$, and it is not in this field (Find the
splitting field of this polynomial over Q).


Corollary 8.3.34 Let F be a finite extension of K. Then there is a smallest subfield
L of $\bar{F}$ which contains F, and which is a normal extension of K.


Proof Suppose that $F = K(a_1, a_2, \ldots, a_r)$. Let $f_i(X)$ be the minimum polynomial of
$a_i$ over K. Let f(X) be the product of these polynomials, and $L \subseteq \bar{F}$ be the splitting
field of f(X) over F. L is also the splitting field of f(X) over K, and it is the smallest
normal extension of K containing F. ♯




Definition 8.3.35 The field L described in the above corollary is called the 
normal closure of F over K. 


Proposition 8.3.36 Let F be a finite extension of K of degree m, and f (X) be an 
irreducible polynomial in K[X] of degree n. Suppose that m and n are co-prime. Then
f(X) is also irreducible in F[X].


Proof Let L be the splitting field of the polynomial f(X) over F. We may assume
that $n > 1$. Now, f(X) cannot have any of its roots in F, for if a is a root of
f(X) which is in F, then $K(a) \subseteq F$. But then $n = \deg f(X) = [K(a) : K]$
will divide $[F : K] = m$. This is a contradiction to the supposition. Let a be a
root of f(X) in L, which, as shown above, is not in F. Consider F(a). Then, since
$K(a) \subseteq F(a)$, $[K(a) : K] = n$ and $[F : K] = m$ both divide $[F(a) : K] = [F(a) : F][F : K]$.
Since m and n are co-prime, $m \cdot n$ divides $[F(a) : F] \cdot m$. This shows that n divides
$[F(a) : F]$. Since a is a root of f(X), it follows that $[F(a) : F] \le n$. Hence
$[F(a) : F] = n = \deg f(X)$, and so f(X) is irreducible over F. ♯


As an application of the above proposition we have the following example: 


Example 8.3.37 The polynomial f(X) = X^7 - 10X* + 5X? + 15X + 10 is
irreducible over $\mathbb{Q}(2^{1/3}, \omega)$: By the Eisenstein irreducibility criterion, f(X) is irreducible
over Q. Next, we have already seen that $\mathbb{Q}(2^{1/3}, \omega)$ is a Galois extension of degree 6,
which is co-prime to 7. The assertion follows from the above proposition.
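The Eisenstein step can be checked mechanically. The sketch below is an added illustration and makes two assumptions that should be read with care: the prime used is 5, and, because the exponents of the terms with coefficients −10 and 5 are not legible in this copy of the example, the code does not pick them but tests every placement of those two terms among the exponents 2 to 6. The function names are hypothetical.

```python
from itertools import permutations


def satisfies_eisenstein(coeffs, p):
    """Eisenstein's criterion at the prime p.

    coeffs[i] is the integer coefficient of X^i; the last entry is the
    leading coefficient.
    """
    *lower, leading = coeffs
    return (leading % p != 0
            and all(c % p == 0 for c in lower)
            and lower[0] % (p * p) != 0)


def candidate(i, j):
    """X^7 - 10 X^i + 5 X^j + 15 X + 10 for hypothetical exponents i, j."""
    coeffs = [10, 15, 0, 0, 0, 0, 0, 1]
    coeffs[i] = -10
    coeffs[j] = 5
    return coeffs


# The criterion only looks at divisibility of the coefficients, so it holds
# (or fails) independently of where the two middle terms sit.
all_ok = all(satisfies_eisenstein(candidate(i, j), 5)
             for i, j in permutations(range(2, 7), 2))
print(all_ok)
```

Since the criterion depends only on the coefficients 1, −10, 5, 15, 10 and not on the positions of the middle terms, the conclusion of the example does not depend on the illegible exponents.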


Exercises 


8.3.1 Show that $\mathbb{Q}(\omega)$ is the splitting field of $X^4 + X^2 + 1$ over Q, where ω is a
primitive cube root of unity.


8.3.2 Show that the splitting field of $X^n - 1$ over Q is $\mathbb{Q}(\rho_n)$, where $\rho_n = e^{2\pi i/n}$ is a
primitive nth root of unity. Show that, in case n = p is a prime, its degree is $p - 1$.
More generally, we shall show that its degree is φ(n).


8.3.3 Determine the degrees of the splitting fields over Q of the following
polynomials:


(i) X*+1. 
(ii) X°+X+4+1. 
(iii) X° + XP 41. 
(iv) X45. 
8.3.4 Determine the degrees of the splitting fields of the following polynomials: 


(i) X37 +X +4+ 1 over Zs. 
(ii) X7 — 5 over Zy). 


8.3.5 Give examples of rational numbers r and s such that the splitting field of
$X^3 + rX + s$ has degree 3 over Q. Can we characterise such r and s?




8.3.6 Use the fact that if σ is a K-automorphism of a field extension L of K, and
f(X) is a polynomial in K[X], then σ takes the roots of f(X) to roots of f(X), to show
the following:


(i) If z is a complex number which is a root of a polynomial f(X) with real
coefficients, then the conjugate $\bar{z}$ is also a root of f(X).
(ii) If r is a rational number which is not a square of a rational number, and $a + b\sqrt{r}$
is a root of a polynomial f(X) in Q[X], then $a - b\sqrt{r}$ is also a root of f(X).


8.3.7 Show that $K = \mathbb{Q}(\sqrt{2} + \sqrt{3})$ is a normal extension of Q of degree 4. Show
that every irreducible polynomial of odd degree over Q is also irreducible over K.


8.3.8 Let K be a field of characteristic p. Show that 


(i) $X^p - a$, where $a \in K$, is either irreducible, or it has all its roots in K.
(ii) $X^p - X - a$ is either irreducible, or it has all its roots in K. Deduce that if $a \neq 0$,
then it is irreducible over $\mathbb{Z}_p$.


8.4 Separable Extensions 


Now, we describe the finite extensions L of K for which the number of K-embeddings
of L into $\bar{L} = \bar{K}$ is $[L : K]$.

Consider the case when L = K(a). Let p(X) be the minimum polynomial of
a. We know that $[K(a) : K] = \deg p(X)$. Since any K-embedding of L will take
a to a root of p(X), there are at most as many K-embeddings of K(a) into $\bar{K}$
as there are distinct roots of p(X). Further, given any root b of p(X), there is a unique
K-isomorphism σ from K(a) to $K(b) \subseteq \bar{K}$ which takes a to b. This shows that there
are exactly as many K-embeddings of K(a) into $\bar{K}$ as there are distinct roots of the
minimum polynomial p(X) of a. Thus, we have proved the following:


Proposition 8.4.1 The number of K-embeddings of K(a) into $\bar{K}$ is $[K(a) : K]$ if and
only if all the roots of the minimum polynomial p(X) of a are distinct. ♯


The above proposition motivates the following definition.


Definition 8.4.2 An irreducible polynomial p(X) in K[X] is called a separable 
polynomial if all its roots are distinct in its splitting field. A polynomial f(X) (not 
necessarily irreducible) is said to be separable if all its irreducible factors are sep- 
arable. An algebraic element a of an extension field L of K is said to be separable 
if the minimum polynomial of a is separable. An algebraic extension L of K is said 
to be a separable extension if every element of L is separable over K. An algebraic 
extension L of K which is not separable is said to be an inseparable extension. 


Proposition 8.4.3 Let L be a separable extension of K and F be an intermediary
field. Then L is a separable extension of F, and F is also a separable extension of K. 




Proof Suppose that L is a separable extension of K. Then every element of L is 
separable over K. In particular every element of F is separable over K. This shows 
that F is separable over K. Further, if a is an element of L, and p(X) is the minimum 
polynomial of a over K, then the minimum polynomial of a over F is a factor of 
p(X). Since p(X) has all its roots distinct, any factor of p(X) will also have its roots 
distinct. Thus, a is separable over F also. ♯


Definition 8.4.4 Let L be a finite extension of K. Let $[L : K]_s$ denote the number
of distinct K-embeddings of L into $\bar{L}$. The number $[L : K]_s$ is called the
degree of separability of the extension L of K (the justification for this terminology
will be clear a little later).


Thus, the above proposition says that $a \in L$ is separable if and only if
$[K(a) : K]_s = [K(a) : K]$. We shall show that L is a separable extension of K if and only if
$[L : K]_s = [L : K]$.


Proposition 8.4.5 Let L be a finite extension of K. Let σ be an isomorphism from K
to K'. Then the number of extensions of σ to an embedding of L into $\bar{K'}$ is $[L : K]_s$.


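Proof Let $\bar{K}$ be an algebraic closure of K containing L. Then it follows by Theorem 8.3.21
that σ can be extended to an isomorphism $\bar{\sigma}$ from $\bar{K}$ to $\bar{K'}$. Let $\Sigma_\sigma(L, \bar{K'})$ be the set of
extensions of σ to homomorphisms from L to $\bar{K'}$, and let $\Sigma_K(L, \bar{K})$ be the set of
K-embeddings of L into $\bar{K}$. We have to show that $|\Sigma_\sigma(L, \bar{K'})| = [L : K]_s$.
Define a map η from $\Sigma_\sigma(L, \bar{K'})$ to $\Sigma_K(L, \bar{K})$ by $\eta(g) = \bar{\sigma}^{-1} \circ g$. Now,
$\eta(g_1) = \eta(g_2)$ implies that $\bar{\sigma}^{-1} \circ g_1 = \bar{\sigma}^{-1} \circ g_2$. Since $\bar{\sigma}$ is bijective, $g_1 = g_2$.
Thus, η is injective. Let $h \in \Sigma_K(L, \bar{K})$. Then $g = \bar{\sigma} \circ h \in \Sigma_\sigma(L, \bar{K'})$ and $\eta(g) = h$.
This shows that η is surjective. Thus $|\Sigma_\sigma(L, \bar{K'})| = |\Sigma_K(L, \bar{K})|$. By the definition,
$|\Sigma_K(L, \bar{K})| = [L : K]_s$. The result follows. ♯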


Corollary 8.4.6 Let L be a finite field extension of K, and let F be an intermediary
field. Then $[L : K]_s = [L : F]_s \cdot [F : K]_s$.


Proof Let $\bar{K}$ be an algebraic closure of K containing L. Then from the above
proposition it follows that every K-embedding of F into $\bar{K}$ has exactly $[L : F]_s$ extensions
to embeddings of L into $\bar{K}$. Further, since there are $[F : K]_s$ K-embeddings of F
into $\bar{K}$, it follows that there are $[L : F]_s \cdot [F : K]_s$ K-embeddings of L into $\bar{K}$. The
result follows from the definition. ♯


Corollary 8.4.7 Let L be a finite extension of K. Then L is a separable extension of
K if and only if $[L : K]_s = [L : K]$.


Proof The proof is by induction on $[L : K]$. If $[L : K] = 1$, then L = K,
and then there is nothing to do. Suppose that the result is true for all extensions
whose degrees are less than n. Let L be a separable extension of K of degree n.
Then every element of L is separable over K. Let $a \in L - K$. Then by the definition
a is separable over K. Let $\bar{K}$ be an algebraic closure of K containing L. We have
already seen that the number of K-embeddings of K(a) in $\bar{K}$ is the number of distinct
roots of the minimum polynomial p(X) of a, and it is the same as $\deg(p(X)) = [K(a) : K]$.
Thus, $[K(a) : K]_s = [K(a) : K]$. We have also observed that L
is separable over any intermediary field, and so L is separable over K(a). By the
induction assumption, $[L : K(a)]_s = [L : K(a)]$. From the previous corollary, we
have $[L : K]_s = [L : K(a)]_s \cdot [K(a) : K]_s = [L : K(a)] \cdot [K(a) : K] = [L : K]$.
Conversely, suppose that $[L : K]_s = [L : K]$. Let $a \in L$. We have to show that a is
separable over K. It is sufficient to show that $[K(a) : K]_s = [K(a) : K]$. Suppose
that $[K(a) : K]_s < [K(a) : K]$. Then since $[L : K(a)]_s \le [L : K(a)]$, we see
that $[L : K]_s = [L : K(a)]_s \cdot [K(a) : K]_s$ is strictly less than $[L : K]$. This is a
contradiction to the hypothesis. ♯


Corollary 8.4.8 Let L be a finite separable extension of F and F a finite separable
extension of K. Then L is a separable extension of K.


Proof Under the hypothesis of the corollary,
$[L : K]_s = [L : F]_s \cdot [F : K]_s = [L : F] \cdot [F : K] = [L : K]$. ♯


Corollary 8.4.9 Let L be a finite extension of K. Let $K^s$ denote the set of all elements
of L which are separable over K. Then $K^s$ is a subfield of L.


Proof Let $a, b \in K^s$. Then we have already seen that K(a) and K(b) are separable
extensions of K. Since b is separable over K, and hence over K(a), it follows from the
previous result that K(a)(b) = K(a, b) is a separable extension of K. Thus $K(a, b) \subseteq K^s$.
Hence $a - b$, $a \cdot b$ and $a^{-1}$ (for $a \neq 0$) are in $K^s$. This shows that $K^s$ is a subfield of L. ♯


Definition 8.4.10 The subfield $K^s$ is called the separable closure of K in L. The
separable closure of K in its algebraic closure is called the separable closure of K.


Corollary 8.4.11 A finite extension L of K is a Galois extension if and only if it is 
separable as well as normal. 


Proof We know that a finite extension L of K is a Galois extension if and only if
$|G(L/K)| = [L : K]$. Suppose that L is a finite Galois extension. Since each
member of G(L/K) can be viewed as a K-embedding of L into $\bar{K}$, there are at least
$[L : K]$ K-embeddings of L into $\bar{K}$. There cannot be more. Hence $[L : K]_s = [L : K]$.
This shows that L is a separable extension of K. We have already seen that a Galois
extension of K is also a normal extension. Conversely, suppose that L is a separable
normal extension. Since it is separable, there are $[L : K]$ K-embeddings of L into
$\bar{K}$. Since L is also a normal extension of K, any K-embedding of L into $\bar{K}$ takes L to
itself. This shows that $|G(L/K)| = [L : K]$, and so L is a Galois extension of K. ♯


Corollary 8.4.12 A finite field extension L of K is a Galois extension if and only if it
is the splitting field of a separable polynomial over K.


Proof A finite extension is a normal extension if and only if it is the splitting field of a
polynomial. Thus, it is sufficient to show that the splitting field L of a polynomial
f(X) over K is a separable extension if and only if f(X) is a separable polynomial.
Suppose that L is the splitting field of a separable polynomial $f(X) \in K[X]$. Let
$L = K(a_1, a_2, \ldots, a_n)$, where $a_1, a_2, \ldots, a_n$ are all the roots of f(X). Then the minimum
polynomial of each $a_i$ is a divisor of f(X). Since a factor of a separable polynomial is
separable, it follows that each $a_i$ is separable. This shows that the separable closure
of K in L is $K(a_1, a_2, \ldots, a_n) = L$. In other words, L is a separable extension
of K. Conversely, suppose that $L = K(a_1, a_2, \ldots, a_n)$ is a splitting field of f(X),
where $a_1, a_2, \ldots, a_n$ are the roots of f(X), and that L is a separable extension of K. Then
the nonconstant irreducible factors of f(X) are precisely the minimum polynomials of
$a_1, a_2, \ldots, a_n$, which are all separable elements. Hence f(X) is separable. ♯


Corollary 8.4.13 A finite extension F of K is a separable extension if and only if 
there is a Galois extension L of K such that F is contained in L. 


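Proof Suppose that F is separable. Since F is a finite extension, $F = K(a_1, a_2, \ldots, a_r)$,
and since it is also separable, the minimum polynomial $f_i(X)$ of $a_i$ is separable for each i.
Let $f(X) = f_1(X) f_2(X) \cdots f_r(X)$. Then f(X) is separable. Let L be the splitting field of f(X)
which contains F. Then from the above corollary, it follows that L is a Galois extension
of K. Conversely, if F is contained in a Galois extension L of K, then L is separable
over K, and so F is separable over K by Proposition 8.4.3. ♯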


Our next aim is to have a test for the separability of a polynomial. We first define 
the concept of formal derivative of a polynomial. 


Definition 8.4.14 The formal derivative f'(X) of a polynomial

$$f(X) = a_0 + a_1 X + a_2 X^2 + \cdots + a_n X^n$$

is defined to be the polynomial

$$f'(X) = a_1 + 2a_2 X + 3a_3 X^2 + \cdots + n a_n X^{n-1}.$$


The proof of the following proposition is straightforward and it is left as an exercise. 


Proposition 8.4.15 Let f(X), g(X) be polynomials in K[X] and a, b € K. Then 


(i) $(af + bg)'(X) = af'(X) + bg'(X)$.
(ii) $(f \cdot g)'(X) = f'(X)g(X) + f(X)g'(X)$.
(iii) $(f \circ g)'(X) = f'(g(X))g'(X)$.
(iv) $f(X + a) = f(a) + X f'(a) + \frac{X^2 f''(a)}{2!} + \cdots + \frac{X^n f^{(n)}(a)}{n!}$. ♯


Proposition 8.4.16 Let f(X) be a nonconstant polynomial in K[X]. Then a is a 
multiple root of f (X) in a splitting field L of f (X) over K if and only if it is a common 
root of f (X) and f'(X). 


Proof If $f(X) = (X - a)g(X)$, then $f'(X) = (X - a)g'(X) + g(X)$, and so $f'(a) = g(a)$.
Thus, a is a multiple root of f(X), that is, a root of g(X), if and only if a is a common
root of f(X) and f'(X). ♯




Proposition 8.4.17 Let f(X) and g(X) be polynomials in K[X]. Let L be a field 
extension of K. Then f(X) and g(X) are co-prime in K[X] if and only if they are 
co-prime in L[X]. 


Proof Suppose that f(X) and g(X) are co-prime in K[X]. Then by the Euclidean
algorithm there are polynomials u(X) and v(X) in K[X] such that u(X)f(X) +
v(X)g(X) = 1. Since polynomials in K[X] are also polynomials in L[X], this
is an identity in L[X] also. This shows that they are co-prime in L[X]. Conversely,
suppose that they are co-prime in L[X]. Then again by the Euclidean algorithm there
are polynomials h(X) and k(X) in L[X] such that h(X)f(X) + k(X)g(X) = 1. If
d(X) in K[X] divides f(X) as well as g(X) in K[X], then it divides f(X) and g(X)
in L[X] also. Hence d(X) divides 1 in L[X], and so d(X) is a unit. This means that
f(X) and g(X) are co-prime in K[X] also. ♯


Proposition 8.4.18 Let f(X) ∈ K[X] be a nonconstant polynomial. Let L be a
splitting field of f(X) over K. Then all the roots of f(X) in L are distinct if and only
if f(X) and f'(X) are co-prime in K[X].


Proof Suppose that f(X) and f'(X) are co-prime in K[X]. Suppose that a is a multiple
root of f(X) in L. Then $f(X) = (X - a)^2 g(X)$ for some g(X) ∈ L[X]. Now, $f'(X) =
2(X - a)g(X) + (X - a)^2 g'(X)$. This shows that X − a divides f(X) as well as f'(X) in L[X].
Hence f(X) and f'(X) are not co-prime in L[X]. From the above proposition, f(X)
and f'(X) are not co-prime in K[X]. This is a contradiction. Hence all the roots of f(X)
in L are distinct. Conversely, suppose that all roots of f(X) are
distinct in L. Let L' be the splitting field of f'(X) over L. Suppose that f(X) and f'(X)
have a common root a in L'. Then a lies in L (note that all roots of f(X) are
in L). Suppose that f(X) = (X − a)g(X). Then f'(X) = g(X) + (X − a)g'(X).
Since a is also a root of f'(X), it follows that X − a divides g(X). But then $(X - a)^2$
divides f(X). This is a contradiction to the supposition that f(X) has no multiple root.
Hence f(X) and f'(X) have no common root in L'. Since f(X) is a product of linear
factors in L'[X], this shows that f(X) and
f'(X) are co-prime in L'[X], and so (by the above proposition) they are also co-prime
in K[X]. ♯
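
Over K = Z_p the criterion of Proposition 8.4.18 can be checked mechanically with the Euclidean algorithm. The following is a minimal sketch, not taken from the text: the coefficient-list representation and the helper names (trim, derivative, poly_mod, poly_gcd, has_distinct_roots) are assumptions made for illustration, and p is assumed to be a prime.

```python
# Polynomials over Z_p are lists of coefficients, lowest degree first.
# All names below are illustrative; p is assumed to be a prime.

def trim(f):
    """Drop trailing zero coefficients (the zero polynomial becomes [])."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f

def derivative(f, p):
    """Formal derivative of f in Z_p[X]."""
    return trim([(i * c) % p for i, c in enumerate(f)][1:])

def poly_mod(f, g, p):
    """Remainder of f on division by the nonzero polynomial g in Z_p[X]."""
    f, g = trim(f), trim(g)
    inv = pow(g[-1], -1, p)          # inverse of the leading coefficient of g
    while len(f) >= len(g):
        c = (f[-1] * inv) % p
        shift = len(f) - len(g)
        for i, gc in enumerate(g):
            f[shift + i] = (f[shift + i] - c * gc) % p
        f = trim(f)
    return f

def poly_gcd(f, g, p):
    """Monic greatest common divisor in Z_p[X] by the Euclidean algorithm."""
    f = trim([c % p for c in f])
    g = trim([c % p for c in g])
    while g:
        f, g = g, poly_mod(f, g, p)
    if not f:
        return []
    inv = pow(f[-1], -1, p)
    return [(c * inv) % p for c in f]

def has_distinct_roots(f, p):
    """True when f and its formal derivative are co-prime in Z_p[X]."""
    return poly_gcd(f, derivative(f, p), p) == [1]
```

For instance, has_distinct_roots([1, 1, 0, 1], 2) tests $X^3 + X + 1$ over $Z_2$, while has_distinct_roots([1, 0, 1], 2) tests $X^2 + 1 = (X + 1)^2$, whose formal derivative is 0.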


Corollary 8.4.19 Let f(X) be an irreducible polynomial in K[X]. Then f(X) is
separable (i.e., all its roots are distinct) if and only if f(X) does not divide f'(X).


Proof Since f(X) is irreducible, the greatest common divisor (f(X), f'(X)) is a unit or it
is an associate of f(X). By definition, f(X) is separable if and only if it has no
multiple roots. From the above proposition, this is equivalent to saying that (f(X), f'(X))
is a unit. This, in turn, is equivalent to saying that f(X) does not divide f'(X). ♯


Corollary 8.4.20 An irreducible polynomial f(X) in K[X] is separable if and only
if f'(X) ≠ 0.


Proof If f'(X) ≠ 0, then it is a nonzero polynomial of lower degree than f(X), and so f(X)
cannot divide f'(X). If f'(X) = 0, then f(X) divides f'(X). The result follows from the above corollary. ♯




Since, over a field of characteristic 0, f'(X) = 0 if and only if f(X) is a constant
polynomial (verify), the following corollary is immediate.


Corollary 8.4.21 Every polynomial over a field K of characteristic 0 is
separable. ♯


Corollary 8.4.22 Let K be a field of characteristic 0. Then every algebraic extension
L of K is separable.


Proof By definition, L is separable over K if and only if all elements of L are
separable over K. This, in turn, means that the minimum polynomial of each element
of L over K is separable. The result follows from the above corollary. ♯


Corollary 8.4.23 Let K be a field of characteristic p ≠ 0. Let f(X) be an irreducible
polynomial in K[X]. Then f(X) is separable if and only if there is no polynomial
g(X) ∈ K[X] such that $f(X) = g(X^p)$.


Proof We have seen that an irreducible polynomial f(X) is separable if and only if
f'(X) ≠ 0. Let


$$f(X) = a_0 + a_1 X + a_2 X^2 + \cdots + a_n X^n$$
be an irreducible polynomial in K[X]. Then
$$f'(X) = a_1 + 2a_2 X + 3a_3 X^2 + \cdots + n a_n X^{n-1} = 0$$


if and only if $i a_i = 0$ for all i. This means that $a_i = 0$ whenever p does not divide i. This
shows that f'(X) = 0 if and only if f(X) is of the form


$$f(X) = a_0 + a_p X^p + a_{2p} X^{2p} + \cdots + a_{rp} X^{rp}$$
for some r. This is equivalent to saying that $f(X) = g(X^p)$, where
$$g(X) = a_0 + a_p X + a_{2p} X^2 + \cdots + a_{rp} X^r.$$


The result follows. ♯


Proposition 8.4.24 Let K be a field of characteristic p ≠ 0. Let a ∈ K. Then the
polynomial $X^p - a$ is irreducible over K if and only if it has no root in K.


Proof If $X^p - a$ is irreducible, then, since its degree is at least 2, it has no root in K. Conversely, suppose
that it has no root in K. Let L be a splitting field of this polynomial, and let b ∈ L be
a root of $X^p - a$. Then b ∉ K and $b^p = a$. Further,


$$X^p - a = X^p - b^p = (X - b)^p.$$


Now, any nonconstant monic factor of $X^p - a$ in K[X] is of the form $(X - b)^t$ for
some t, 1 ≤ t ≤ p. Suppose that $(X - b)^t \in K[X]$. Then 1 < t, for b ∉ K.
Suppose that t < p. Since the coefficient of $X^{t-1}$ in $(X - b)^t$ is −tb, it follows that tb ∈ K. Since t is
co-prime to p, t is invertible in K, and so b ∈ K. This is a contradiction to the supposition. Hence t = p, and so
$X^p - a$ is irreducible. ♯


Corollary 8.4.25 Let K be a field of characteristic p ≠ 0. Then every algebraic
extension of K is separable over K if and only if $K^p = \{a^p \mid a \in K\} = K$.


Proof To say that every algebraic extension of K is separable is equivalent to
saying that every polynomial over K is separable. This, in turn, is equivalent to saying that
every irreducible polynomial over K is separable. Suppose that every irreducible
polynomial over K is separable. Let a be a nonzero element of K. Consider the
polynomial $X^p - a$. If it is irreducible, then since its derivative is 0, it is not separable.
Hence $X^p - a$ is not irreducible. From the previous proposition, it follows that $X^p - a$
has a root b ∈ K. This shows that $a = b^p \in K^p$. Conversely, suppose that $K^p = K$.
We show that every irreducible polynomial over K is separable. Suppose the contrary.
Let f(X) be an irreducible polynomial in K[X] which is not separable. Then we have
$f(X) = g(X^p)$ for some polynomial g(X) ∈ K[X]. Suppose that


$$g(X) = a_0 + a_1 X + a_2 X^2 + \cdots + a_r X^r.$$
Since $K^p = K$, we have $b_i \in K$ such that $b_i^p = a_i$ for all i. Hence
$$f(X) = g(X^p) = (b_0 + b_1 X + b_2 X^2 + \cdots + b_r X^r)^p.$$


This contradicts the supposition that f(X) is irreducible. ♯


Definition 8.4.26 A field K is said to be perfect if every algebraic extension of K is
separable.


The above results can be restated in the light of the above definition:


Corollary 8.4.27 Every field of characteristic 0 is a perfect field. A field K of
characteristic p ≠ 0 is perfect if and only if $K^p = K$. ♯


Corollary 8.4.28 Every finite field is perfect. 


Proof Let K be a finite field of characteristic p. Consider the map σ from K to
K defined by $\sigma(a) = a^p$. Then σ(a) = σ(b) implies that $a^p = b^p$. Now,
$(a - b)^p = a^p - b^p = 0$. Since K is a field, a − b = 0. Thus, σ is injective. Since
K is finite, it is surjective. This means that $K^p = K$. From the previous result, K is
perfect. ♯


We know that the order of every finite field is $p^n$ for some prime p and n ≥ 1.


Theorem 8.4.29 Given any prime p and n ≥ 1, there is one and only one (up to
isomorphism) field of order $p^n$.




Proof Consider the field $Z_p$ of residue classes modulo p. Consider the polynomial
$X^{p^n} - X$ in $Z_p[X]$. Since its derivative is −1 ≠ 0, all its roots in the splitting field L of
this polynomial are distinct. We show that L is precisely the set of roots of $X^{p^n} - X$.
It is sufficient to show that the set of roots of $X^{p^n} - X$ forms a field. Let a and b be
roots of this polynomial. Then $a^{p^n} = a$ and $b^{p^n} = b$. Now,


$$(a + b)^{p^n} = a^{p^n} + b^{p^n} = a + b,$$
$$(ab)^{p^n} = a^{p^n} b^{p^n} = ab,$$


and if a ≠ 0, then
$$(a^{-1})^{p^n} = (a^{p^n})^{-1} = a^{-1}.$$


This shows that a + b, ab, and $a^{-1}$ are all roots of the above polynomial. Since the $p^n$ roots
are distinct, L is a field of order $p^n$. This completes
the proof of the existence of a field of order $p^n$. For uniqueness, let $L_1$ and $L_2$ be fields of order
$p^n$. Then both are splitting fields of $X^{p^n} - X$ over their respective prime subfields.
Since their prime subfields are isomorphic (to $Z_p$), it follows from Corollary 8.3.9
that $L_1$ and $L_2$ are isomorphic. ♯


Corollary 8.4.30 Every finite extension of a finite field is a Galois extension. If K is
a field containing q elements, and L is a field extension of K of degree n, then the Galois
group G(L/K) is a cyclic group of order n generated by σ, where σ is defined by
$\sigma(a) = a^q$.


Proof Clearly, L is the splitting field of $X^{q^n} - X$ over K. Hence L is a normal as well
as separable extension of K. By Corollary 8.4.11, it is a Galois extension. Further,
since $a^q = a$ for all a ∈ K, it is clear that the map σ defined by $\sigma(a) = a^q$ is
a member of G(L/K). Also $\sigma^n(a) = a^{q^n} = a$ for all a ∈ L, and so $\sigma^n = I_L$.
If m < n, then $\sigma^m$ cannot be the identity map, for otherwise every element of
L would turn out to be a root of $X^{q^m} - X$, which has at most $q^m < q^n$ roots. This shows that σ generates a cyclic
subgroup of G(L/K) of order n. Since L is Galois over K, |G(L/K)| = [L : K] = n.
Thus, σ generates G(L/K). ♯


Remark 8.4.31 Consider the algebraic closure L of $Z_p$. Let K be a subfield of L of order
$p^n$, and F a subfield of order $p^m$. Then K ⊆ F if and only if n divides m. This is
clear, for if K ⊆ F, then F is a vector space over K of dimension r (say), and then F
contains exactly $(p^n)^r = p^{nr}$ elements. This means that m = nr. Conversely,
if m = nr, then $X^{p^n} - X$ divides $X^{p^{nr}} - X$. In fact, putting $d = p^n - 1$ and
$s = (p^{nr} - 1)/(p^n - 1)$, we have


$$X^{p^{nr}} - X = (X^{p^n} - X)(1 + X^{d} + X^{2d} + \cdots + X^{(s-1)d}).$$


This shows that the splitting field K of $X^{p^n} - X$ is contained in the splitting field F
of $X^{p^m} - X$.


Proposition 8.4.32 Let K be a finite field containing $q = p^m$ elements. Let f(X)
be a monic irreducible polynomial of degree n over K. Let a be a root of f(X) in a
field extension L of K. Then K(a) is the splitting field of f(X), and all roots of f(X)
are of the form $a^{q^r}$, r ≥ 1.


Proof Clearly, [K(a) : K] = deg f(X) = n. Thus, K(a) is a field extension of K of
degree n. From the above results, K(a) is a Galois extension, and it is the splitting field
of $X^{q^n} - X$ over K. Thus, all the roots of f(X) lie in K(a). Further, G(K(a)/K) is
the cyclic group of order n generated by σ, where σ is defined by $\sigma(b) = b^q$. This
shows that $\{\sigma^r(a) \mid 1 \le r \le n\} = \{a^{q^r} \mid 1 \le r \le n\}$ is the set of all distinct roots
of f(X). ♯


These results say that to find a field K of order $p^n$, it is sufficient to find an irreducible
polynomial f(X) of degree n over $Z_p$, and then K is isomorphic to the field
$Z_p[X]/\langle f(X) \rangle$.
There is an effective procedure using the division algorithm in $Z_p[X]$ to enumerate
the elements of K, and also to determine the addition and multiplication in K. How does
one determine the irreducible polynomials of degree n in $Z_p[X]$? As observed, if f(X) is
a monic irreducible polynomial over $Z_p$ of degree n, then the splitting field of f(X)
is the same as the splitting field of $X^{p^n} - X$. If a is a root of f(X), then it is also a root
of $X^{p^n} - X$. Since f(X) is the minimum polynomial of a, it divides $X^{p^n} - X$. This
shows that all irreducible monic polynomials of degree n are irreducible factors
of $X^{p^n} - X$. Further, let g(X) be any monic irreducible polynomial in $Z_p[X]$ of
degree m, where m divides n. Then the splitting field of g(X) is also the splitting
field of $X^{p^m} - X$, and it is contained in the splitting field of $X^{p^n} - X$. Hence any
root of g(X) is also a root of $X^{p^n} - X$. Conversely, if g(X) is an irreducible monic
polynomial of degree m which is a factor of $X^{p^n} - X$, then the splitting field of g(X)
is of order $p^m$, and it is contained in the splitting field of $X^{p^n} - X$. Thus, m divides
n. This shows that all irreducible polynomials of degree m, where m divides n, are
factors of $X^{p^n} - X$, and they are the only (up to associates) irreducible factors of $X^{p^n} - X$. Since
the roots of this polynomial are all distinct, it has no repeated irreducible factors.
The arguments above combine to give the following:


Theorem 8.4.33 The polynomial $X^{p^n} - X$ in $Z_p[X]$ is the product of all the distinct
monic irreducible polynomials in $Z_p[X]$ whose degrees are divisors of n. ♯


One can develop a computer program to factorize $X^{p^n} - X$ in $Z_p[X]$ for small
primes p and for small n.
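
One possible shape for such a program is sketched below. It is a brute-force illustration, not the author's method: it enumerates the monic polynomials of each degree d dividing n, keeps the irreducible ones by trial division, and compares their product with $X^{p^n} - X$. The coefficient-list representation and all function names are assumptions made for this sketch, and p is assumed to be a small prime.

```python
from itertools import product

# Polynomials over Z_p are coefficient lists, lowest degree first (illustrative choice).

def poly_mul(f, g, p):
    """Product of two polynomials over Z_p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out

def poly_rem(f, g, p):
    """Remainder of f on division by a monic polynomial g over Z_p."""
    f = list(f)
    while len(f) >= len(g):
        c, shift = f[-1], len(f) - len(g)
        for i, gc in enumerate(g):
            f[shift + i] = (f[shift + i] - c * gc) % p
        f.pop()  # the leading coefficient has just been cancelled
    return f

def monic_polys(d, p):
    """All monic polynomials of degree d over Z_p."""
    for low in product(range(p), repeat=d):
        yield list(low) + [1]

def is_irreducible(f, p):
    """Trial division of the monic polynomial f by monic polynomials of degree <= deg(f)/2."""
    d = len(f) - 1
    return all(any(poly_rem(f, g, p))
               for k in range(1, d // 2 + 1)
               for g in monic_polys(k, p))

def irreducible_factors(p, n):
    """Monic irreducible polynomials over Z_p whose degree divides n."""
    return [f for d in range(1, n + 1) if n % d == 0
            for f in monic_polys(d, p) if is_irreducible(f, p)]

def x_pn_minus_x(p, n):
    """Coefficient list of X^(p^n) - X over Z_p."""
    f = [0] * (p ** n + 1)
    f[1] = (-1) % p
    f[-1] = 1
    return f

def multiply_all(polys, p):
    """Product of a list of polynomials over Z_p."""
    result = [1]
    for f in polys:
        result = poly_mul(result, f, p)
    return result

if __name__ == "__main__":
    p, n = 2, 3
    factors = irreducible_factors(p, n)
    print(factors)
    print(multiply_all(factors, p) == x_pn_minus_x(p, n))
```

Trial division is exponential in the degree, so this sketch is only suitable for the small p and n mentioned above.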


Example 8.4.34 Consider the case p = 2 and n = 3. We wish to find all monic
irreducible polynomials of degree 3 in $Z_2[X]$, and also to factorize $X^8 - X$ as a product
of irreducible factors. Since 1 and 3 are the only divisors of 3, the irreducible factors
of $X^8 - X$ are irreducible polynomials of degree 1 and irreducible polynomials of
degree 3. The irreducible polynomials of degree 1 are X and X + 1 only. Consider the
irreducible polynomials of degree 3. Let us enumerate the monic polynomials of degree 3.
They are $X^3$, $X^3 + X^2$, $X^3 + X$, $X^3 + 1$, $X^3 + X^2 + X$, $X^3 + X^2 + 1$, $X^3 + X + 1$,
and $X^3 + X^2 + X + 1$. The polynomials of degree 3 which have no roots in $Z_2$ are
$X^3 + X^2 + 1$ and $X^3 + X + 1$. Being of degree 3 without roots, they are irreducible. Thus,


$$X^8 - X = X(X + 1)(X^3 + X^2 + 1)(X^3 + X + 1).$$


If a is a root of $X^3 + X^2 + 1$, then $a^3 = a^2 + 1$. Note that $a^2$ and $a^4$ are also
roots of $X^3 + X^2 + 1$. If b is a root of $X^3 + X + 1$, then $b^3 = b + 1$.
$Z_2[X]/\langle X^3 + X^2 + 1 \rangle$ and $Z_2[X]/\langle X^3 + X + 1 \rangle$ are both fields of order 8.
Determine an explicit isomorphism between them. Compare this with the example
of a field of order 8 given in Chap. 7 of Algebra 1.
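
The arithmetic of the field of order 8 in Example 8.4.34 can be carried out by machine. The sketch below is an illustration under stated assumptions: elements of $Z_2[X]/\langle X^3 + X^2 + 1 \rangle$ are stored as 3-bit masks (bit k is the coefficient of $X^k$), and the names MODULUS, gf8_mul, evaluate, and roots are hypothetical helpers introduced here.

```python
MODULUS = 0b1101  # bitmask of X^3 + X^2 + 1 (bit k is the coefficient of X^k)

def gf8_mul(u, v, modulus=MODULUS):
    """Multiply two elements of Z_2[X]/<modulus>, each stored as a 3-bit mask."""
    result = 0
    while v:
        if v & 1:
            result ^= u          # addition in characteristic 2 is XOR
        v >>= 1
        u <<= 1                  # multiply u by X
        if u & 0b1000:           # degree reached 3: reduce modulo the cubic
            u ^= modulus
    return result

def evaluate(poly, x, modulus=MODULUS):
    """Evaluate a polynomial over Z_2 (given as a bitmask) at x by Horner's rule."""
    acc = 0
    for k in reversed(range(poly.bit_length())):
        acc = gf8_mul(acc, x, modulus) ^ ((poly >> k) & 1)
    return acc

def roots(poly, modulus=MODULUS):
    """All elements of the field of order 8 at which poly vanishes."""
    return {x for x in range(8) if evaluate(poly, x, modulus) == 0}

a = 0b010                 # the class of X, a root of X^3 + X^2 + 1
a2 = gf8_mul(a, a)        # a^2
a4 = gf8_mul(a2, a2)      # a^4

if __name__ == "__main__":
    print(roots(0b1101))  # roots of X^3 + X^2 + 1
    print(roots(0b1011))  # roots of X^3 + X + 1 in the same field
```

Calling roots(0b1011) lists the elements of this field that satisfy $X^3 + X + 1$; each of them is a candidate image of b under an isomorphism from $Z_2[X]/\langle X^3 + X + 1 \rangle$.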


Example 8.4.35 Consider the case when p = 2 and n = 5. Since 1 and 5 are the
only divisors of 5, it follows that only irreducible polynomials of degrees 1 and 5
are factors of $X^{32} - X$. The irreducible polynomials of degree 1 (as above) are X and
X + 1. The irreducible polynomials in $Z_2[X]$ of degree 5 are precisely


$$X^5 + X^3 + 1, \quad X^5 + X^2 + 1, \quad X^5 + X^4 + X^3 + X + 1,$$
$$X^5 + X^4 + X^2 + X + 1, \quad X^5 + X^4 + X^3 + X^2 + 1,$$


and $X^5 + X^3 + X^2 + X + 1$. It is easily seen that $X^{32} - X$ is the product of X, X + 1, and these six irreducible
polynomials.


Exercises 


8.4.1 Give an example of an inseparable polynomial over some field.
Hint. Consider the field $K = Z_p(X)$ and take the polynomial $Y^p - X$ in K[Y].


8.4.2 Let K be a field of characteristic p ≠ 0. Show that the field K(X) is not a
perfect field.


8.4.3 Let K be a field of characteristic p ≠ 0. Let f(X) be an irreducible polynomial
in K[X]. Show that there exist n ≥ 0 and a separable polynomial g(X) in K[X] such
that $f(X) = g(X^{p^n})$.


8.4.4 Let L be a field extension of K, where the characteristic of K is p ≠ 0. Let a be
an element of L which is algebraic over K. Show that there is a positive integer n
such that $a^{p^n}$ is separable over K.


8.4.5 Call an element a of an extension L of a field K of characteristic p ≠ 0 a
purely inseparable element if it is algebraic over K, and its minimum polynomial
has only one root, namely a. Show that a is purely inseparable over K if and only if
$a^{p^n} \in K$ for some n.


8.4.6 Show that an element a of L is separable as well as purely inseparable if and
only if a ∈ K.


8.4.7 Show that if a ∈ L is purely inseparable over K, then K(a) is the splitting field of
the minimum polynomial of a, and G(K(a)/K) is trivial.


8.4.8 Let L be an algebraic extension of K. Show that every element of L is purely
inseparable over the separable closure $K_s$ of K in L.




8.4.9 Call an algebraic extension L of K a purely inseparable extension if
every element of L is purely inseparable over K. Show that L is purely inseparable
over $K_s$.


8.4.10 Show that if L is a finite purely inseparable extension of K, then $[L : K] = p^n$
for some prime p and n.


8.4.11 Let L be a finite extension of K, and F be an intermediary field. Show that
L is a purely inseparable extension of K if and only if L is purely inseparable over F,
and F is purely inseparable over K.


8.4.12 Let L be a field extension of K. Let $K_i$ denote the set of all purely inseparable
elements of L. Show that $K_i$ forms a subfield of L, called the purely inseparable closure
of K in L. Observe that L need not be separable over $K_i$.


8.4.13 Define $[L : K]_i = [L : K_s]$ and call it the degree of inseparability. Show
that $[L : K]_i = [L : F]_i [F : K]_i$, where F is an intermediary field.


8.4.14 Let L be a finite normal extension of K. Show that $K_s$ is a Galois extension
of K, and G(L/K) is isomorphic to $G(K_s/K)$. Deduce that $|G(L/K)| = [L : K]_s$.


8.4.15 Let K be a field of characteristic p ≠ 0 and a ∈ K such that $a \notin K^p$. Show
that $X^p - a$ is irreducible over K.


8.4.16 Find all irreducible polynomials of degree 4 over $Z_2$. Factorize $X^{16} - X$ as a
product of irreducible factors over $Z_2$. Determine the structure of a field of order 16.


8.4.17 Determine the cubic irreducible polynomials over $Z_3$, and factorize $X^{27} - X$
over $Z_3$. Determine the structure of a field of order 27.


8.4.18 Express X* + 1 as product of irreducible elements in Z3[X]. Determine its 
splitting field. 


8.4.19 Show that X* — 7 is irreducible over Zs. 
Hint. Observe that 7 is not quadratic residue mod 5. 


8.4.20 Determine irreducible polynomials of degree 5 over $Z_3$, and factorize
$X^{3^5} - X$ as a product of irreducible elements in $Z_3[X]$.


8.4.21 Let ψ(q, d) denote the number of monic irreducible polynomials of degree d over
a field K containing q elements. Then show that


$$q^n = \sum_{d \mid n} d\,\psi(q, d).$$
Use the Möbius inversion formula to show that
$$n\,\psi(q, n) = \sum_{d \mid n} \mu(d)\, q^{n/d}.$$
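
The inversion formula of Exercise 8.4.21 is easy to evaluate mechanically. The sketch below is illustrative only: the function names mobius and count_irreducibles are hypothetical, and the count is taken to be that of monic irreducible polynomials of degree n over a field with q elements.

```python
def mobius(n):
    """Moebius function mu(n), computed by trial division."""
    result = 1
    d = 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:      # a squared prime factor gives mu = 0
                return 0
            result = -result
        d += 1
    if n > 1:                   # one remaining prime factor
        result = -result
    return result

def count_irreducibles(q, n):
    """Number of monic irreducible polynomials of degree n over a field with q elements."""
    total = sum(mobius(d) * q ** (n // d)
                for d in range(1, n + 1) if n % d == 0)
    return total // n

if __name__ == "__main__":
    print(count_irreducibles(2, 3))
    print(count_irreducibles(2, 5))
```

For q = 2 the values for n = 3 and n = 5 are 2 and 6, in agreement with Examples 8.4.34 and 8.4.35.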


8.4.22 Find the number of irreducible polynomials of degree 9 over a field K of 
order 125. 




8.5 Fundamental Theorem of Galois Theory 


In this section, we relate the intermediary fields of Galois extensions with the
subgroups of the Galois groups. We translate problems in field theory to
problems in group theory. As a simple application, we prove the fundamental theorem of
algebra. Other applications of the fundamental theorem of Galois theory will follow in
the following sections.


Theorem 8.5.1 (Fundamental theorem of Galois theory). Let L be a finite Galois
extension of K. Let S(G(L/K)) denote the set of all subgroups of the Galois group
G(L/K). Let S(L/K) denote the set of all intermediary subfields of the field extension
L of K. Then we have a bijective map φ from S(G(L/K)) to S(L/K) given by φ(H) =
F(H) (the fixed field of H), and a bijective map ψ from S(L/K) to S(G(L/K)) given
by ψ(F) = G(L/F) such that φ and ψ are inverses of each other. Further, the
following conditions hold.


(i) $H_1 \subseteq H_2 \Longrightarrow F(H_2) \subseteq F(H_1)$.
(ii) $F_1 \subseteq F_2 \Longrightarrow G(L/F_2) \subseteq G(L/F_1)$.
(iii) |H| = [L : F(H)], and [F(H) : K] = [G(L/K) : H].
(iv) H is a normal subgroup of G(L/K) if and only if F(H) is a Galois extension of
K, and then G(F(H)/K) is isomorphic to the quotient group G(L/K)/H.


Proof Clearly, φ and ψ are maps. By Corollary 8.2.20, H = G(L/F(H)).
Also, since L is a Galois extension of K, for any intermediary field F, L is also
a Galois extension of F, and so it follows from the definition of Galois extension
(Definition 8.2.3) that F(G(L/F)) = F. This shows that φ and ψ are inverses of
each other. In particular, they are bijective maps also.

(i) and (ii) are restatements of Proposition 8.2.4. The part (iii) follows from
Theorem 8.2.19.

(iv) Suppose that H is a normal subgroup of G(L/K). We have to show that F(H)
is a Galois extension of K. Since L is a Galois, and so separable, extension of K, every
element of L is separable over K. In particular, every element of F(H) is separable
over K. Thus, in any case, F(H) is a separable extension of K. We show that under the
assumption that H is a normal subgroup of G(L/K), F(H) is also a normal extension of
K. It suffices to show that σ(F(H)) ⊆ F(H) for all σ ∈ G(L/K). Let a ∈ F(H), and
σ ∈ G(L/K). Let τ ∈ H. Then $\tau(\sigma(a)) = \sigma((\sigma^{-1}\tau\sigma)(a)) = \sigma(a)$, for $\sigma^{-1}\tau\sigma$
belongs to H and a ∈ F(H). This shows that σ(a) ∈ F(H) for all σ ∈ G(L/K).
Thus, F(H) is a Galois extension of K. Conversely, suppose that F(H) is a Galois
extension of K. Then any σ ∈ G(L/K) restricted to F(H) is an automorphism
of F(H). This enables us to define a map η from G(L/K) to G(F(H)/K) by
$\eta(\sigma) = \sigma|_{F(H)}$ (the restriction of σ to F(H)). Clearly, this is a homomorphism.
Since L is a Galois extension of K, any element of G(F(H)/K) can be extended to
an element of G(L/K). Thus, η is a surjective homomorphism. Further, Ker η =
G(L/F(H)) = H. This shows that H is a normal subgroup of G(L/K), and by
the fundamental theorem of homomorphism, G(L/K)/H is isomorphic to
G(F(H)/K). ♯




Corollary 8.5.2 Let L be a finite Galois extension of K. Then there are only finitely
many intermediary fields. In particular, L is a simple extension of K.


Proof Since G(L/K) is finite of order [L : K], it has only finitely many subgroups.
By the fundamental theorem of Galois theory, there is a bijective correspondence
between the set of subgroups of G(L/K) and the set of intermediary fields. The first
assertion follows. Finally, by Theorem 8.1.17, it follows that L is a simple extension of K. ♯


Corollary 8.5.3. Every finite separable extension is simple. 


Proof Let L be a finite separable extension of K. Suppose that $L = K(a_1, a_2, \ldots,
a_n)$. Let $f_i(X)$ be the minimum polynomial of $a_i$. Then each $f_i(X)$ is separable. Let
L' be the splitting field of $f(X) = f_1(X) f_2(X) \cdots f_n(X)$ containing L. Then L' is a
finite Galois extension of K. Hence there are only finitely many intermediary fields
between L' and K. In particular, there are only finitely many intermediary fields
between L and K. Again, by Theorem 8.1.17, L is a simple extension of K. ♯


Corollary 8.5.4 Every finite field extension of a field K of characteristic 0 is simple. 


Proof Since every finite extension of a field of characteristic 0 is separable, the result
follows. ♯


Our next aim will be to use the fundamental theorem of Galois theory to enumerate 
intermediary fields in a Galois extension L of K by calculating the Galois group 
G(L/K), enumerating all subgroups of G(L/K), and finding fixed fields of these 
subgroups. In this section we give some simple examples to illustrate it. 


Example 8.5.5 Recall Examples 8.2.14, 8.2.23, and 8.3.13. The extension $L =
Q(2^{1/3}, \omega) = Q(2^{1/3} + \omega)$ is a Galois extension of Q with Galois group
isomorphic to $S_3$. Since $S_3$ has 6 subgroups, there are 6 intermediary fields. We enumerate
them. Clearly, the fixed field $F(S_3)$ of $S_3$ is Q. The fixed field of the trivial
subgroup is the field $Q(2^{1/3} + \omega)$. Consider the subgroup ⟨σ⟩ of G(L/Q) generated
by the element σ, where σ takes $2^{1/3}$ to $\omega 2^{1/3}$, and takes ω to itself. Clearly, ⟨σ⟩
is the unique normal subgroup of G(L/Q) of order 3, and it is isomorphic to $A_3$.
Thus, F(⟨σ⟩) is a Galois extension of Q of degree [G(L/Q) : ⟨σ⟩] = 2.
Also ω ∈ F(⟨σ⟩). Thus, Q(ω) ⊆ F(⟨σ⟩). Since [Q(ω) : Q] = 2, it follows that
$F(\langle \sigma \rangle) = Q(\omega) = Q(\omega^2)$. Let $\tau_1, \tau_2, \tau_3$ be the elements of G(L/Q)
given by $\tau_1(2^{1/3}) = 2^{1/3}$, $\tau_1(\omega) = \omega^2$; $\tau_2(2^{1/3}) = \omega 2^{1/3}$, $\tau_2(\omega) = \omega^2$; and
$\tau_3(2^{1/3}) = \omega^2 2^{1/3}$, $\tau_3(\omega) = \omega^2$. It is clear that $\langle \tau_1 \rangle$, $\langle \tau_2 \rangle$, and $\langle \tau_3 \rangle$
are all the 3 subgroups of order 2 of G(L/Q). Now, $\tau_1$ fixes the field $Q(2^{1/3})$, and
$[Q(2^{1/3}) : Q] = 3 = [G(L/Q) : \langle \tau_1 \rangle]$. Thus, $F(\langle \tau_1 \rangle) = Q(2^{1/3})$.
Similarly, $F(\langle \tau_2 \rangle) = Q(\omega^2 2^{1/3})$, and $F(\langle \tau_3 \rangle) = Q(\omega 2^{1/3})$. This determines
all intermediary fields.
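
The count of subgroups used in Example 8.5.5 can be confirmed by brute force. The following is a small sketch under stated assumptions: the Galois group is modelled as the group of all permutations of the three roots of $X^3 - 2$ (indexed 0, 1, 2), and the helper names compose, inverse, is_subgroup, and is_normal are introduced here for illustration.

```python
from itertools import combinations, permutations

# S_3 modelled as permutations of the indices 0, 1, 2 of the three roots.
G = list(permutations(range(3)))
IDENTITY = (0, 1, 2)

def compose(s, t):
    """The permutation 's after t'."""
    return tuple(s[t[i]] for i in range(len(t)))

def inverse(s):
    """Inverse permutation of s."""
    inv = [0] * len(s)
    for i, j in enumerate(s):
        inv[j] = i
    return tuple(inv)

def is_subgroup(H):
    """A finite subset containing the identity and closed under products is a subgroup."""
    return IDENTITY in H and all(compose(s, t) in H for s in H for t in H)

def is_normal(H):
    """Check that g h g^{-1} lies in H for all g in G and h in H."""
    return all(compose(compose(g, h), inverse(g)) in H for g in G for h in H)

# Test every nonempty subset of G; feasible because |G| = 6.
subgroups = [set(H) for r in range(1, len(G) + 1)
             for H in combinations(G, r) if is_subgroup(set(H))]

if __name__ == "__main__":
    for H in subgroups:
        print(len(H), is_normal(H))
```

The enumeration finds 6 subgroups, of orders 1, 2, 2, 2, 3, and 6, and only those of orders 1, 3, and 6 are normal, matching the 6 intermediary fields listed in the example.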


Example 8.5.6 Consider the polynomial $X^4 - 5$ in Q[X]. By the Eisenstein
irreducibility criterion, it is irreducible over Q. Let L be the splitting field of this
polynomial over Q. Then L is a Galois extension of Q. We find its Galois group, and
also all intermediary fields. Let a be a real fourth root of 5. Then ±a, ±ia are
all the four roots of $X^4 - 5$. Thus, ±i ∈ L. Hence L = Q(a, i). It is easy to
verify that L = Q(a + i). Now, [Q(i) : Q] = 2. Consider the polynomial $X^4 - 5$
over Q(i). Note that Q(i) is the field of fractions of Z[i]. It is easy to observe that
1 + 2i is an irreducible element of Z[i] which divides 5 (5 = (1 + 2i)(1 − 2i)),
but $(1 + 2i)^2$ does not divide 5. By the Eisenstein irreducibility criterion, $X^4 - 5$ is
irreducible over Q(i). Since a is a root of $X^4 - 5$, L = Q(i)(a) is of degree 4
over Q(i). Thus, [L : Q] = [L : Q(i)][Q(i) : Q] = 8. Now, we find G(L/Q).
We first find G(L/Q(i)). Consider the automorphism σ defined by σ(a) = ai and
σ(i) = i. Clearly, σ ∈ G(L/Q(i)), and it generates a cyclic group of order 4. Thus,
⟨σ⟩ = G(L/Q(i)). It has a unique subgroup $\langle \sigma^2 \rangle$ of order 2. Since
$\sigma^2(a^2) = a^2$, it follows that the fixed field of $\langle \sigma^2 \rangle$ is $Q(a^2, i)$ (find β such that
$Q(a^2, i) = Q(\beta)$). Thus, we have a maximal tower $Q \subseteq Q(i) \subseteq Q(a^2, i) \subseteq L$ of
intermediary subfields. Next, consider Q(a), which is the fixed field of the subgroup
⟨τ⟩ of order 2, where τ is defined by τ(i) = −i and τ(a) = a (in fact, τ
is complex conjugation), and which is a degree 4 extension of Q. Clearly, this is not
a Galois extension of Q, and so ⟨τ⟩ is not a normal subgroup; in particular, τ does not lie in the center. Therefore, the Galois
group is neither abelian nor the quaternion group (every subgroup of the quaternion group is normal). It is, therefore, the dihedral group
with presentation $\langle \sigma, \tau ;\ \sigma^4 = I = \tau^2,\ \tau\sigma\tau = \sigma^3 \rangle$. The fixed field of the
subgroup $\langle \sigma^2, \tau \rangle$, which is a normal subgroup of G(L/Q) of order 4, is clearly
$Q(a^2)$. This gives us another maximal tower $Q \subseteq Q(a^2) \subseteq Q(a) \subseteq L$ of
intermediary fields. Consider the element $\sigma^2\tau$. This element takes a to −a and i to −i. This is
an element of order 2. The fixed field of $\langle \sigma^2\tau \rangle$ is Q(ai), which is a field extension
of Q of degree 4, and it is isomorphic to Q(a). Clearly, it contains $Q(a^2)$. This gives
another maximal tower $Q \subseteq Q(a^2) \subseteq Q(ai) \subseteq L$. Similarly, looking at other towers
of subgroups of the group G(L/Q), we can find all towers of intermediary subfields.
This is left as an exercise.
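
As an aid to the exercise, the subgroup lattice of the Galois group of Example 8.5.6 can be enumerated by machine. The sketch below makes explicit modelling assumptions: the four roots $i^k a$ of $X^4 - 5$ are indexed by k = 0, 1, 2, 3, σ acts as k ↦ k + 1 (mod 4), and τ (complex conjugation) acts as k ↦ −k (mod 4). The helper names are hypothetical, and the enumeration relies on the fact that every subgroup of a group of order 8 of this type is generated by at most two elements.

```python
def compose(s, t):
    """The permutation 's after t' on the root indices 0..3."""
    return tuple(s[t[k]] for k in range(4))

def inverse(s):
    """Inverse permutation of s."""
    return tuple(s.index(k) for k in range(4))

def generated(gens):
    """Subgroup generated by the given permutations (closure under left multiplication)."""
    identity = (0, 1, 2, 3)
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return frozenset(group)

sigma = (1, 2, 3, 0)   # a -> ia: root k goes to root k + 1 (mod 4)
tau = (0, 3, 2, 1)     # complex conjugation: root k goes to root -k (mod 4)

G = generated([sigma, tau])
sigma3 = compose(sigma, compose(sigma, sigma))

# Every subgroup here is generated by at most two elements.
subgroups = {generated([g, h]) for g in G for h in G}

def is_normal(H):
    """Check that g H g^{-1} = H for every g in G."""
    return all(frozenset(compose(compose(g, h), inverse(g)) for h in H) == H
               for g in G)

if __name__ == "__main__":
    for H in sorted(subgroups, key=len):
        print(len(H), is_normal(H))
```

Under this model the group has order 8, the relation τστ = σ³ holds, and there are 10 subgroups in all, so the extension has 10 intermediary fields (including Q and L); the subgroup ⟨τ⟩ is not normal, in line with the remark that Q(a) is not a Galois extension of Q.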


Example 8.5.7 Let K = C(X), the field of fractions of C[X]. Then, since X is a
prime element in C[X], by the Eisenstein irreducibility criterion, $Y^n - X$ is irreducible
in K[Y]. Let L be the splitting field of this polynomial over K. We determine the
Galois group, and also all the intermediary fields. Let a be a root of $Y^n - X$. Let
$\rho = e^{2\pi i/n}$ be a primitive nth root of unity. Then $\rho^i a$, 1 ≤ i ≤ n, are all the roots of
the polynomial $Y^n - X$. Since ρ ∈ C ⊆ K, L = K(a) is the splitting field of
$Y^n - X$. This shows that L is a Galois extension of K, and the Galois group is of
order [L : K] = n. We show that this is a cyclic group of order n. Thus, there will
be τ(n) (the number of positive divisors of n) subgroups of the Galois group, and so also τ(n)
intermediary subfields. We determine them. We have an automorphism σ in G(L/K)
given by σ(a) = ρa. Then $\sigma^i(a) = \rho^i a$. This shows that σ is an element of order
n. Thus, G(L/K) = ⟨σ⟩. Corresponding to any positive divisor m of n, there is
a unique subgroup $\langle \sigma^r \rangle$ of order m, where n = mr. Now, $\sigma^r(a^m) = \rho^{rm} a^m = a^m$.
This shows that $K(a^m)$ is contained in the fixed field $F(\langle \sigma^r \rangle)$. Since $a^m$ is a root of
$Y^r - X$, and $Y^r - X$ is irreducible, it follows that $[K(a^m) : K] = r = [G(L/K) : \langle \sigma^r \rangle]$. This
shows that $K(a^m)$ is the fixed field of $\langle \sigma^r \rangle$. This determines all the τ(n) intermediary
fields.




Now, we prove the fundamental theorem of algebra as an application of the
fundamental theorem of Galois theory. The fundamental theorem of algebra states that
the field C of complex numbers is algebraically closed. This result was first proved
by Gauss. The routine proof is usually given in a standard complex analysis course
using the fact that there is no nonconstant bounded function which is analytic throughout the
complex plane. We first prove some basic results in the form of lemmas.


Lemma 8.5.8 There is no proper odd degree extension of the field R of real numbers. 


Proof Let L be a field extension of the field R of reals such that [L : R] is odd and greater than 1. Since
R is of characteristic 0, L is a separable extension of R, and since every finite separable extension is
simple, there is an element a ∈ L such that L = R(a). Let p(X) be the minimum
polynomial of a. Then p(X) is an irreducible polynomial of odd degree greater than
1. It is sufficient, therefore, to show that no polynomial of odd degree greater than
1 is irreducible over R. Let f(X) be a polynomial of degree 2n + 1 over R, which we may assume to be monic. Then
$\lim_{x \to -\infty} f(x) = -\infty$ and $\lim_{x \to \infty} f(x) = \infty$. Thus, there exists a > 0 such that
f(a) > 0 and f(−a) < 0. By the intermediate value theorem, there is a c ∈ R
such that f(c) = 0, and so f(X) cannot be irreducible. ♯


Lemma 8.5.9 There is no Galois extension L of the field C of complex numbers such
that $[L : C] = 2^n$, n ≥ 1.


Proof Suppose the contrary. Let L be a Galois extension of C such that |G(L/C)|
= $2^n$, where n ≥ 1. Then G(L/C) has a maximal normal subgroup H of order $2^{n-1}$.
Since H is a normal subgroup of index 2, F(H) is an extension of C of degree 2. Suppose that
F(H) = C(a). Completing the square in the minimum polynomial of a, we may assume that
a ∉ C and $a^2 \in C$. This is impossible, for if $a^2 = re^{i\theta}$, then
$a = \pm r^{1/2} e^{i\theta/2}$ belongs to C. ♯


Theorem 8.5.10 (Fundamental Theorem of Algebra). The field C of complex num- 
bers is algebraically closed. 


Proof Let K be a finite extension of C. We have to show that K = C. Suppose not.
Then $K = C(a_1, a_2, \ldots, a_r) = R(i, a_1, a_2, \ldots, a_r)$, and [K : C] ≥ 2. Let $f_j(X)$ be
the minimum polynomial of $a_j$ over R. Let $f(X) = (X^2 + 1) f_1(X) f_2(X) \cdots f_r(X)$. Let L
be the splitting field of f(X) over R. Then L is a Galois extension of R containing K.
Since [L : R] = [L : C][C : R] = 2[L : C] is even, G(L/R) is of even order, and
so it has a Sylow 2-subgroup H of order $2^m$ (say). Consider the fixed field F(H)
of H. Then [F(H) : R] = [G(L/R) : H] is odd. Since R has no proper extension
of odd degree, we have F(H) = R. This means that G(L/R) = H is a 2-group.
Hence G(L/C) ⊆ G(L/R) is also of order $2^s$ for some s ≥ 1. This is a contradiction
to the above lemma. ♯


Exercises 


8.5.1 Let K be a field, and f(X) ∈ K[X]. Find the splitting field L, the Galois group
G(L/K), and all intermediary subfields in each of the following cases. 




(i) K = Q, f(X) = X* - 11. 
(ii) K = Q, f(X%) = X® — 10. 
(iii) K = Zs, f(X) = X* — 2. 
(iv) K = Zo, f() = K+ XK 41. 
(v) K = Q, f(%) = X* — 11. 


8.5.2 Find the Galois group of $L = Q(\sqrt{2}, \sqrt{3})$ over Q. Find all intermediary
fields.


8.5.3 Let L be a Galois extension of K with Galois group Zj09. Find the number of 
intermediary subfields and also find them. 


8.5.4 Let L be a finite Galois extension of K and L = K(a). Show that $\{\sigma(a) \mid \sigma \in
G(L/K)\}$ is a basis of L over K (This result is known as the normal basis theorem).


8.5.5 Let L be a finite Galois extension of K. Let $L_1 \subseteq L_2$ be intermediary
fields which are Galois extensions of K. Show that $G(L_1/K)$ is isomorphic to
$G(L_2/K)/G(L_2/L_1)$.


8.5.6 Let $L_1$ and $L_2$ be intermediary fields of a Galois extension L of K. Suppose
that $L_2$ is a Galois extension of K. Show that $L_1L_2$ (the smallest subfield of L
containing $L_1$ and also $L_2$) is a Galois extension of $L_1$, and $G(L_1L_2/L_1)$ is isomorphic to
$G(L_2/L_1 \cap L_2)$.


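8.5.7 Let L be a Galois extension of K. Let


$$K = K_1 \subseteq K_2 \subseteq \cdots \subseteq K_n = L \qquad (8.5.1)$$
and
$$K = L_1 \subseteq L_2 \subseteq \cdots \subseteq L_m = L \qquad (8.5.2)$$


be two towers of intermediary extensions of the Galois extension L of K, where $K_{i+1}$
is a Galois extension of $K_i$, and $L_{j+1}$ is a Galois extension of $L_j$ for all i and j. Show
that there are refinements


$$K = F_1 \subseteq F_2 \subseteq \cdots \subseteq F_t = L$$
and
$$K = F'_1 \subseteq F'_2 \subseteq \cdots \subseteq F'_t = L$$


of (8.5.1) and (8.5.2), respectively, such that after some rearrangement $G(F_{i+1}/F_i)$ is isomorphic
to $G(F'_{i+1}/F'_i)$ for all i.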




8.5.8 State and prove the analogue of the Jordan Holder theorem for towers of
intermediary subfields of a Galois extension.


8.5.9 Let L be a finite Galois extension of K with no proper intermediary fields. 
Show that G(L/K) is a cyclic group of prime order. 


8.5.10 Let L be a finite Galois extension of K with G(L/K) simple. Let F be an
intermediary field such that K ≠ F ≠ L. Show that there is a K-automorphism σ of L
such that σ(F) ≠ F.


8.5.11 Let L be a Galois extension of degree 15 over a field K. Find the number of 
intermediary fields. 


8.5.12 Let L be a finite Galois extension of K, and H a subgroup of G(L/K). Show
that F(N(H)) is the smallest subfield contained in F(H) such that F(H) is a Galois
extension of F(N(H)).


8.5.13 Let $L$ be a finite Galois extension of $K$ of degree 35. Let $F$ be any intermediary field. Show that $F$ is a Galois extension of $K$.


8.5.14 Let $L$ be a finite extension of $K$ such that $p^n$ is the largest power of the prime $p$ dividing $[L : K]$. Show that any two intermediary fields $F_1$ and $F_2$ such that $[F_1 : K] = [F_2 : K] = p^n$ are $K$-isomorphic. If $m$ is the number of intermediary fields which are extensions of degree $p^n$, then show that $m$ divides $[L : K]$, and it is of the form $1 + kp$.


8.5.15 Let $L$ be a finite Galois extension of $K$, and $H$ be a subgroup of $G(L/K)$. Let $F$ be the normal closure of $F(H)$. Show that $G(L/F) = \bigcap_{\sigma \in G(L/K)} \sigma H \sigma^{-1}$.


8.5.16 Let $L$ be a finite Galois extension of $K$, and $H$ a subgroup of $G(L/K)$. Let $F(H) = K(a)$, and $S$ a left transversal of $G(L/K)$ modulo $H$. Show that

$$\prod_{\sigma \in S} (X - \sigma(a))$$

is the minimum polynomial of $a$. Deduce that

$$\prod_{\sigma \in G(L/K)} (X - \sigma(a)) = p(X)^{n/r},$$

where $n$ is the degree $[L : K]$, $r$ is the index of $H$ in $G(L/K)$, and $p(X)$ is the minimum polynomial of $a$ over $K$. Further, deduce the normal basis theorem (Exercise 8.5.4).


8.5.17 Let $F$ be a field of characteristic $p \neq 0$. Let $L = F(X, Y)$ be the field of fractions of $F[X, Y]$ and $K = F(X^p, Y^p)$. Show that $L$ is a finite extension of $K$ with $[L : K] = p^2$ which is not simple. Show that there are infinitely many intermediary fields.




8.6 Cyclotomic Extensions 


Definition 8.6.1 A finite extension $L$ of $K$ is called a cyclotomic extension of $K$ if there is an element $\xi \in L$ such that $L = K(\xi)$ and $\xi^n = 1$ for some $n$. A Galois extension $L$ of $K$ is called an abelian extension if the Galois group $G(L/K)$ is abelian. It is called cyclic if the Galois group is cyclic.


A root of $X^n - 1$ in a field $K$ is called an $n$th root of unity. An element $a \in K$ is called a primitive $n$th root of unity if its order in the multiplicative group $K^*$ is $n$. It follows that any root of unity is a primitive root of unity for some $n$.

Let $K$ be a field. Consider the polynomial $X^n - 1$ in $K[X]$. Suppose first that the characteristic of $K$ is $p \neq 0$. Suppose that $p$ divides $n$ and $n = p^r m$, where $p$ and $m$ are co-prime. Then $X^n - 1 = (X^m - 1)^{p^r}$. Thus, the roots of $X^n - 1$ and those of $X^m - 1$ in any extension field are the same. In other words, the splitting field of $X^n - 1$ and the splitting field of $X^m - 1$ are the same. Let $L$ be the splitting field of $X^m - 1$, and so also of $X^n - 1$. Since $p$ does not divide $m$, all the roots of $X^m - 1$ are distinct. Let $G$ be the group of roots of $X^m - 1$ in $L$. Since it is a finite subgroup of $L^*$ of order $m$, it is a cyclic group of order $m$. A generator $\xi_m$ of $G$ is a primitive $m$th root of unity. Thus, $G = \{\xi_m^r \mid 0 \leq r \leq m - 1\}$.

Note that if $\xi_m$ is a primitive $m$th root of unity, then any primitive $m$th root of unity is of the form $\xi_m^r$, where $1 \leq r \leq m - 1$ and $(r, m) = 1$. Clearly, $L = K(\xi_m)$. Any $K$-automorphism $\eta$ of $L$ is uniquely determined by its effect on $\xi_m$, and its restriction to $G$ is an automorphism of $G$. This defines an injective group homomorphism from the Galois group $G(L/K)$ to $Aut(G)$. From the theory of cyclic groups, it follows that $Aut(G)$ is isomorphic to the group $U_m$ of prime residue classes modulo $m$.

Next, if the characteristic of $K$ is 0, then all roots of $X^n - 1$ are distinct, and the splitting field contains a primitive $n$th root of unity $\xi_n$. The above argument is valid when $m$ is replaced by $n$. We can summarize the above discussion in the following theorem.


Theorem 8.6.2 Let $K$ be a field, and $\xi$ be a root of unity. Then $K(\xi)$ is a Galois extension of $K$. Further, if $\xi_m$ is a primitive $m$th root of unity, then $K(\xi_m)$ is an abelian extension of $K$, and $G(K(\xi_m)/K)$ is isomorphic to a subgroup of $U_m$. In particular, $[K(\xi_m) : K]$ divides $\varphi(m)$. ♯


In general, the Galois group $G(K(\xi_m)/K)$ need not be the whole of $U_m$; it depends on the field $K$. For example, $G(\mathbb{Q}(i)(\xi_8)/\mathbb{Q}(i))$ is a subgroup of order 2 of the group $U_8$, and $U_8$ is of order 4. Note that $G(\mathbb{Q}(\xi_8)/\mathbb{Q})$ is isomorphic to $U_8$.
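The example above can be checked by a small computation. The following is a minimal sketch, not taken from the text; the function names `unit_group` and `is_subgroup` are illustrative. It lists $U_8$ and picks out those residues $r$ for which $\xi_8 \mapsto \xi_8^r$ fixes $i = \xi_8^2$, which is the condition for an automorphism of $\mathbb{Q}(\xi_8)$ to fix $\mathbb{Q}(i)$.

```python
from math import gcd

def unit_group(m):
    """Return the prime residue classes modulo m, i.e. the elements of U_m."""
    return [r for r in range(1, m + 1) if gcd(r, m) == 1]

def is_subgroup(subset, m):
    """Check that a nonempty subset of U_m is closed under multiplication mod m."""
    s = set(subset)
    return all((x * y) % m in s for x in s for y in s)

U8 = unit_group(8)  # four elements, so phi(8) = 4

# i = xi_8^2 is fixed by xi_8 -> xi_8^r exactly when 2r = 2 (mod 8).
fix_i = [r for r in U8 if (2 * r) % 8 == 2]

print(U8, fix_i, is_subgroup(fix_i, 8))
```

The residues that survive form a subgroup of order 2 of $U_8$, in agreement with the example.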


Remark 8.6.3 In general, the minimum polynomials of distinct primitive $n$th roots of unity may be distinct over fields of characteristic $p \neq 0$. For example, we have the factorization

$$X^7 - 1 = (X - 1)(X^3 + X + 1)(X^3 + X^2 + 1)$$

over $\mathbb{Z}_2$. Clearly, $X^3 + X + 1$ is the minimum polynomial of three of the roots of $X^7 - 1$ in the splitting field, which are all primitive 7th roots of unity, and similarly, $X^3 + X^2 + 1$ is the minimum polynomial of the rest of the primitive 7th roots of unity. We shall see below that all the primitive $n$th roots of unity over $\mathbb{Q}$ have the same minimum polynomial, and we shall describe it.
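The factorization over $\mathbb{Z}_2$ in the remark can be verified mechanically. The sketch below is an illustration and not part of the text; `poly_mul_mod` and `has_root_mod` are hypothetical helper names, and polynomials are stored as coefficient lists with the lowest degree first. A cubic with no root in $\mathbb{Z}_2$ is irreducible over $\mathbb{Z}_2$, which is what the second helper checks.

```python
def poly_mul_mod(f, g, p):
    """Multiply two polynomials (coefficients lowest degree first) modulo p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out

def has_root_mod(f, p):
    """Return True if f has a root in Z_p."""
    return any(
        sum(c * pow(x, k, p) for k, c in enumerate(f)) % p == 0
        for x in range(p)
    )

x_minus_1 = [1, 1]   # X - 1 = X + 1 over Z_2
f1 = [1, 1, 0, 1]    # X^3 + X + 1
f2 = [1, 0, 1, 1]    # X^3 + X^2 + 1

product = poly_mul_mod(poly_mul_mod(x_minus_1, f1, 2), f2, 2)
target = [1, 0, 0, 0, 0, 0, 0, 1]   # X^7 - 1 = X^7 + 1 over Z_2

print(product == target, has_root_mod(f1, 2), has_root_mod(f2, 2))
```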


Definition 8.6.4 Let $\{\rho_1, \rho_2, \ldots, \rho_{\varphi(n)}\}$ denote the set of all primitive $n$th roots of unity in the field $\mathbb{C}$ of complex numbers. The polynomial

$$\phi_n(X) = \prod_{i=1}^{\varphi(n)} (X - \rho_i)$$

is called the $n$th cyclotomic polynomial.


Thus, the degree of $\phi_n(X)$ is $\varphi(n)$. Further, since every $n$th root of unity is a primitive $d$th root of unity for a unique positive divisor $d$ of $n$, we have


Proposition 8.6.5 $X^n - 1 = \prod_{d \mid n} \phi_d(X)$. ♯


Proposition 8.6.6 $\phi_n(X) \in \mathbb{Z}[X]$.


Proof Consider $L = \mathbb{Q}(\rho)$, where $\rho$ is a primitive $n$th root of unity over $\mathbb{Q}$. Then $L$ is the splitting field of $X^n - 1$ over $\mathbb{Q}$. Since any $\mathbb{Q}$-automorphism of $L$ takes a primitive root to a primitive root, it follows that $\phi_n(X)$ is fixed by all members of the Galois group $G(L/\mathbb{Q})$. Since $L$ is a Galois extension of $\mathbb{Q}$, it follows that $\phi_n(X)$ is a polynomial in $\mathbb{Q}[X]$ whose leading coefficient is 1. First, observe that if $f(X), g(X) \in \mathbb{Z}[X]$ and $h(X) \in \mathbb{Q}[X]$ are polynomials with leading coefficient 1 such that $f(X) = g(X)h(X)$, then $h(X) \in \mathbb{Z}[X]$. Now, we prove, by induction on $n$, that $\phi_n(X) \in \mathbb{Z}[X]$ for each $n$. Clearly, $\phi_1(X) = X - 1 \in \mathbb{Z}[X]$. Assume that the result holds for all $m$ less than $n$. From the previous proposition, we have

$$X^n - 1 = \Big(\prod_{d \mid n,\ d < n} \phi_d(X)\Big)\phi_n(X).$$

The left-hand side is a monic polynomial in $\mathbb{Z}[X]$, and by the induction hypothesis the first factor on the right-hand side (being a product of monic polynomials in $\mathbb{Z}[X]$) is a monic polynomial in $\mathbb{Z}[X]$. From the earlier observation, $\phi_n(X) \in \mathbb{Z}[X]$. ♯


Example 8.6.7 The above proposition gives an inductive procedure to find the $n$th cyclotomic polynomial. We illustrate it in this example. If $p$ is prime, then all $p$th roots of 1 except 1 are primitive roots. Thus,

$$\phi_p(X) = \frac{X^p - 1}{X - 1} = 1 + X + X^2 + \cdots + X^{p-1}.$$

Clearly, $\phi_1(X) = X - 1$, $\phi_2(X) = X + 1$, $\phi_3(X) = X^2 + X + 1$. Hence

$$X^6 - 1 = (X - 1)(X + 1)(X^2 + X + 1)\phi_6(X).$$

Thus,

$$\phi_6(X) = \frac{X^6 - 1}{(X - 1)(X + 1)(X^2 + X + 1)} = X^2 - X + 1.$$
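The inductive procedure of the example is easy to mechanize: divide $X^n - 1$ by the cyclotomic polynomials of all proper divisors of $n$. The following is a minimal sketch under that reading; the names `poly_divide_exact` and `cyclotomic` are illustrative, and polynomials are stored as integer coefficient sequences with the lowest degree first.

```python
from functools import lru_cache

def poly_divide_exact(num, den):
    """Divide integer polynomials (lowest degree first); den must be monic
    and must divide num exactly."""
    num = list(num)
    quot = [0] * (len(num) - len(den) + 1)
    for k in range(len(quot) - 1, -1, -1):
        coeff = num[k + len(den) - 1]   # den is monic, so no division is needed
        quot[k] = coeff
        for j, d in enumerate(den):
            num[k + j] -= coeff * d
    assert all(c == 0 for c in num), "division was not exact"
    return quot

@lru_cache(maxsize=None)
def cyclotomic(n):
    """n-th cyclotomic polynomial as a tuple of integer coefficients."""
    poly = [-1] + [0] * (n - 1) + [1]   # X^n - 1
    for d in range(1, n):
        if n % d == 0:
            poly = poly_divide_exact(poly, cyclotomic(d))
    return tuple(poly)

print(cyclotomic(6))   # coefficients of X^2 - X + 1, lowest degree first
```

Only integer subtraction and multiplication occur, which reflects the fact that every divisor used is monic with integer coefficients.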


Theorem 8.6.8 $\phi_n(X)$ is irreducible over $\mathbb{Q}$ for each positive integer $n$.


Proof Suppose the contrary. Then $\phi_n(X)$ is reducible in $\mathbb{Q}[X]$. Since $\phi_n(X)$ is a monic polynomial in $\mathbb{Z}[X]$, it follows that $\phi_n(X)$ is reducible in $\mathbb{Z}[X]$. Let $f(X)$ be a monic irreducible factor of $\phi_n(X)$ in $\mathbb{Z}[X]$. Suppose that $\phi_n(X) = f(X)g(X)$, where $g(X)$ is also a monic polynomial in $\mathbb{Z}[X]$ of positive degree. Let $\rho$ be a root of $f(X)$ in the splitting field. Then $\rho$ is a primitive $n$th root of 1. We show that $\rho^r$ is a root of $f(X)$ for all $r$ co-prime to $n$. The proof is by induction on $r$. If $r = 1$, then there is nothing to prove. Assume that the statement is true (for every root of $f(X)$) for all $t$ less than $r$ and co-prime to $n$, where $r \geq 2$. Let $p$ be a prime dividing $r$ and $r = ps$. Since $r$ is co-prime to $n$, it follows that $p$ does not divide $n$. Thus, $\rho^p$ is also a primitive $n$th root of 1, and so it is a root of $\phi_n(X)$. We claim that it is a root of $f(X)$. Suppose not. Then $\rho^p$ is a root of $g(X)$. But then $\rho$ is a root of $g(X^p)$. Since $f(X)$ is the minimum polynomial of $\rho$, it follows that $f(X)$ divides $g(X^p)$ in $\mathbb{Q}[X]$. Since $f(X)$ is monic, it follows that $f(X)$ divides $g(X^p)$ in $\mathbb{Z}[X]$. Reducing modulo $p$, we see that $\bar{f}(X)$ divides $\bar{g}(X^p) = (\bar{g}(X))^p$ in $\mathbb{Z}_p[X]$. Let $h(X)$ be an irreducible factor of $\bar{f}(X)$ (note that $\bar{f}(X)$ in $\mathbb{Z}_p[X]$ is neither zero nor a unit). Then $h(X)$ divides $\bar{g}(X)$ in $\mathbb{Z}_p[X]$. This means that $h(X)^2$ divides $\bar{\phi}_n(X) = \bar{f}(X)\bar{g}(X)$. But then $\bar{\phi}_n(X)$ will have repeated roots in the splitting field of $X^n - 1$ over $\mathbb{Z}_p$. Since $\bar{\phi}_n(X)$ divides $X^n - 1$ in $\mathbb{Z}_p[X]$, it follows that $X^n - 1$ has repeated roots in its splitting field. This is impossible, for $p$ does not divide $n$. Thus, we see that $\rho^p$ is a root of $f(X)$. By the induction hypothesis, applied to the root $\rho^p$ and the exponent $s < r$, $\rho^r = \rho^{ps} = (\rho^p)^s$ is a root of $f(X)$. Thus, all roots of $\phi_n(X)$ are also roots of $f(X)$, and since all the roots are distinct, $f(X) = \phi_n(X)$. This is a contradiction. ♯


Corollary 8.6.9 All the primitive $n$th roots of 1 have the same minimum polynomial over $\mathbb{Q}$. If $\rho$ is a primitive $n$th root of 1, then $\mathbb{Q}(\rho)$ is a Galois extension of $\mathbb{Q}$ with Galois group isomorphic to $U_n$. In particular, $\mathbb{Q}(\rho)$ is an abelian extension, and $[\mathbb{Q}(\rho) : \mathbb{Q}] = \varphi(n)$.


Proof We have already seen that $\mathbb{Q}(\rho)$ is a Galois extension with Galois group isomorphic to a subgroup of $U_n$. From the above result it follows that $[\mathbb{Q}(\rho) : \mathbb{Q}] = \varphi(n) = |U_n|$. The result follows. ♯


Now, we shall describe the cyclic extensions. Recall that a Galois extension L of 
K is called a cyclic extension if the Galois group is cyclic. Since the structure of a 
cyclic group is easy to describe, one can describe the intermediary fields easily. We 
shall first discuss cyclic extensions L of K of order n, where K contains a primitive 
nth root of 1. 


Theorem 8.6.10 Let $K$ be a field which contains a primitive $n$th root $\rho$ of 1. Let $L$ be a Galois extension of $K$ such that $G(L/K)$ is a cyclic group of order $n$ generated by $\sigma$. Then there is a nonzero element $\alpha \in L$ such that $\rho = \sigma(\alpha)\alpha^{-1}$. Further, $L = K(\alpha)$, where $\alpha^n = \beta \in K$, i.e., $L = K(\sqrt[n]{\beta})$ for some $\beta \in K$.




Proof We first show that $\rho \in K$ is an eigenvalue of $\sigma$, or equivalently, $\rho$ is a root of the characteristic polynomial of $\sigma$. Since $\sigma$ is of order $n$, it satisfies the polynomial $X^n - 1$. Further, by the Dedekind theorem, $G(L/K) = \{I, \sigma, \sigma^2, \ldots, \sigma^{n-1}\}$ is linearly independent. Hence $\sigma$ cannot be a root of a polynomial of lower degree. This shows that $X^n - 1$ is the minimum polynomial of $\sigma$. Since the dimension of $L$ over $K$ is $n$, the characteristic polynomial is also $X^n - 1$. Since $\rho$ is a primitive $n$th root of 1, it is a root of the characteristic polynomial of $\sigma$. Hence, $\sigma(\alpha) = \rho\alpha$ for some nonzero $\alpha \in L$. It follows that $\sigma^r(\alpha) = \rho^r\alpha = \alpha$ if and only if $n$ divides $r$. This means that $G(L/K(\alpha)) = \{I\}$. This shows that $L = K(\alpha)$. Also $\sigma(\alpha^n) = (\sigma(\alpha))^n = \rho^n\alpha^n = \alpha^n$. This shows that $\alpha^n \in F(G(L/K)) = K$. Take $\beta = \alpha^n$. ♯


The following theorem is, in some sense, a converse of the above theorem.


Theorem 8.6.11 Let $K$ be a field which contains a primitive $n$th root of 1. Let $L$ be a finite extension of $K$. Let $\alpha \in L$ be such that $\alpha^n \in K$. Then $K(\alpha)$ is a cyclic extension of $K$. Further, if $r$ is the order of the Galois group of $K(\alpha)$ over $K$, then $r$ divides $n$ and $\alpha^r \in K$, i.e., $K(\alpha) = K(\sqrt[r]{\beta})$ for some $\beta$ in $K$.


Proof Suppose that $\alpha^n = a \in K$. Then $\alpha$ is a root of $X^n - a$. Since $K$ has a primitive $n$th root of 1, it follows that the characteristic of $K$ does not divide $n$, and so all the roots of $X^n - a$ in its splitting field are distinct (the derivative of $X^n - a$ is nonzero). Let $\rho$ be a primitive $n$th root of 1. Then each $\rho^i\alpha$ is a root of $X^n - a$. It follows that $K(\alpha)$ is the splitting field of $X^n - a$, and so it is a Galois extension. Let $G$ denote the group of $n$th roots of 1 in $K$. Then $G$ is a cyclic group of order $n$. Define a map $\eta$ from $G(K(\alpha)/K)$ to $G$ by $\eta(\sigma) = \sigma(\alpha)\alpha^{-1}$ (note that $\sigma(\alpha) = \rho^r\alpha$ for some $r$). It is easy to see that $\eta$ is a homomorphism which is injective (a member of $G(K(\alpha)/K)$ is uniquely determined by its effect on $\alpha$). Since a subgroup of a cyclic group is cyclic, it follows that the Galois group of $K(\alpha)$ over $K$ is cyclic, and it is isomorphic to a subgroup of $G$. Suppose that the order of the Galois group $G(K(\alpha)/K)$ is $r$. Then $r$ divides $n$. Let $\sigma$ be a generator of $G(K(\alpha)/K)$. Then $\sigma^m = I$ if and only if $r$ divides $m$. Suppose that $\sigma(\alpha) = \rho^s\alpha$. Then $\rho^s$ is of order $r$. This means that $\sigma(\alpha^r) = (\rho^s)^r\alpha^r = \alpha^r$, and so $\alpha^r \in K$. The result follows if we take $\beta = \alpha^r$. Clearly, $\rho^s$ is a primitive $r$th root of 1.


Theorem 8.6.12 (Artin-Schreier) Let $K$ be a field of characteristic $p \neq 0$. Let $L$ be a cyclic Galois extension of $K$ of degree $p$. Then there is $a \in K$ and $\alpha \in L$ such that $L = K(\alpha)$, where $\alpha^p - \alpha - a = 0$. Conversely, if there is $a \in K$ such that there is no $\alpha \in K$ such that $\alpha^p - \alpha = a$, then $X^p - X - a$ is irreducible in $K[X]$, and its splitting field is a cyclic Galois extension of $K$ of degree $p$.


Proof Suppose that $L$ is a cyclic Galois extension of $K$ of degree $p$. Let $\sigma$ be a generator of $G(L/K)$. Then $\sigma^p - I = 0$. Thus, $\sigma$ satisfies the polynomial $X^p - 1$. Since $G(L/K) = \{I, \sigma, \sigma^2, \ldots, \sigma^{p-1}\}$ is linearly independent (Dedekind theorem), $\sigma$ cannot satisfy a polynomial of lower degree. In other words, $X^p - 1$ is the minimum polynomial of $\sigma$. Let $T$ denote the linear transformation $\sigma - I$. Then $T^p = (\sigma - I)^p = \sigma^p - I = 0$. Thus, $\mathrm{image}\, T^{p-1} \subseteq \mathrm{Ker}\, T$. Also $T^{p-1} \neq 0$, for otherwise $\sigma$ would satisfy a polynomial of degree $p - 1$. Further, $\mathrm{Ker}\, T = \{a \in L \mid 0 = T(a) = \sigma(a) - a\} = \{a \in L \mid \sigma(a) = a\} = F(G(L/K)) = K$. Since $\mathrm{image}\, T^{p-1}$ is a nonzero subspace of $\mathrm{Ker}\, T = K$, it follows that $\mathrm{image}\, T^{p-1} = K$. Let $\beta \in L$ be such that $T^{p-1}(\beta) = 1$. Take $\alpha = T^{p-2}(\beta)$. Then $T(\alpha) = 1$, and so $\sigma(\alpha) = \alpha + 1$. Since $\alpha$ is not fixed by $\sigma$, it follows that $\alpha \notin K$. Since $[K(\alpha) : K]$ divides $[L : K] = p$, it follows that $L = K(\alpha)$. Further, $\sigma(\alpha^p - \alpha) = \sigma(\alpha)^p - \sigma(\alpha) = (\alpha + 1)^p - (\alpha + 1) = \alpha^p + 1 - \alpha - 1 = \alpha^p - \alpha$. This shows that $\alpha^p - \alpha$ belongs to $K$. Put $a = \alpha^p - \alpha$. Then $X^p - X - a$ is the minimum polynomial of $\alpha$.

Conversely, let $a \in K$ be such that there is no $\beta \in K$ such that $\beta^p - \beta = a$. Let $L$ be the splitting field of $f(X) = X^p - X - a$. Let $\alpha \in L$ be a root of $f(X)$. By the hypothesis, $\alpha \notin K$. Since $i^p = i$ for all $i$ in the prime field of $K$, we have $p$ distinct roots $\alpha, \alpha + 1, \alpha + 2, \ldots, \alpha + p - 1$. Thus, all the roots of $f(X)$ are in $K(\alpha)$. This shows that $L = K(\alpha)$. Now, we show that $f(X)$ is irreducible. Suppose not. Let $f(X) = p_1(X)p_2(X)\cdots p_m(X)$, $m > 1$, be the factorization of $f(X)$ as a product of irreducible factors. Let $\alpha_i$ be a root of $p_i(X)$. Then it is also a root of $f(X)$. From the earlier argument, it follows that $L = K(\alpha_i)$ for each $i$. Thus, $\deg p_i(X) = [L : K]$. This shows that $p = \deg f(X) = [L : K]m$. This is a contradiction, for $p$ is prime, $[L : K] > 1$ and $m > 1$. This shows that $f(X)$ is irreducible, and so $[L : K] = p$. Hence the Galois group of $L$ over $K$ is cyclic of prime order. ♯
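The key step in the converse is that $f(X) = X^p - X - a$ is unchanged under $X \mapsto X + 1$ in characteristic $p$, so the roots are $\alpha, \alpha + 1, \ldots, \alpha + p - 1$. The sketch below checks this identity coefficient by coefficient over the prime field $\mathbb{Z}_p$ (an assumed special case of $K$), and also checks that for $a \neq 0$ the polynomial has no root in $\mathbb{Z}_p$. The function names are illustrative.

```python
from math import comb

def artin_schreier(p, a):
    """Coefficients (lowest degree first, mod p) of f(X) = X^p - X - a."""
    coeffs = [0] * (p + 1)
    coeffs[p] = 1
    coeffs[1] = (-1) % p
    coeffs[0] = (-a) % p
    return coeffs

def artin_schreier_shifted(p, a):
    """Coefficients (lowest degree first, mod p) of f(X + 1)."""
    coeffs = [comb(p, k) % p for k in range(p + 1)]   # (X + 1)^p
    coeffs[1] = (coeffs[1] - 1) % p                    # subtract X
    coeffs[0] = (coeffs[0] - 1 - a) % p                # subtract 1 and a
    return coeffs

def roots_in_prime_field(p, a):
    """Roots of X^p - X - a lying in Z_p."""
    return [x for x in range(p) if (pow(x, p, p) - x - a) % p == 0]

for p in (2, 3, 5, 7):
    for a in range(1, p):
        assert artin_schreier_shifted(p, a) == artin_schreier(p, a)
        assert roots_in_prime_field(p, a) == []
```

By the theorem, the absence of a root in $\mathbb{Z}_p$ already forces $X^p - X - a$ to be irreducible over $\mathbb{Z}_p$ for $a \neq 0$.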


Finally, in this section, we prove a very important and useful result known as Hilbert's Theorem 90. First we have some definitions.

Let $L$ be a finite field extension of $K$. Let $a \in L$. The map $L_a$ from $L$ to $L$ defined by $L_a(x) = ax$ is a $K$-linear endomorphism of the vector space $L$ over $K$.


Definition 8.6.13 Let $L$ be a finite field extension of $K$. We define two functions $N_{L/K}$ and $T_{L/K}$ from $L$ to $K$, the norm and the trace functions, by $N_{L/K}(a) = \mathrm{Det}\, L_a$ and $T_{L/K}(a) = \mathrm{Trace}\, L_a$.


It follows easily that $L_{ab} = L_aL_b$, $L_{a+b} = L_a + L_b$ and $L_{\alpha a} = \alpha L_a$ for all $a, b \in L$ and $\alpha \in K$. Since the determinant of the product of two linear transformations is the product of their determinants, and the trace is a linear functional, it follows that $N_{L/K}(ab) = N_{L/K}(a)N_{L/K}(b)$, and $T_{L/K}$ is a linear functional on $L$. Also $L_{a^{-1}} = (L_a)^{-1}$, and so $N_{L/K}$ is a group homomorphism from $L^*$ to $K^*$.


Proposition 8.6.14 Let $L$ be a finite extension of $K$ of degree $n$, and $a \in L$. Let $p(X)$ be the minimum polynomial of $a$ over $K$. Then $p(X)$ is also the minimum polynomial of the linear transformation $L_a$. Further, if $\deg p(X) = m$, then the characteristic polynomial of $L_a$ is $p(X)^{n/m}$.


Proof The map $\eta$ from $L$ to $End_K(L)$ defined by $\eta(a) = L_a$ is easily seen to be an injective algebra homomorphism (observe that both sides are algebras over $K$). This shows that the minimum polynomial of $a$ over $K$ is the same as the minimum polynomial of $L_a$. If $\chi(X)$ is the characteristic polynomial of $L_a$, then it is a fact of elementary linear algebra (Cayley-Hamilton theorem) that the minimum polynomial $p(X)$ of $L_a$ divides the characteristic polynomial, and they have the same irreducible factors. This shows that $\chi(X) = p(X)^r$ for some $r$. Comparing the degrees, and observing that the degree of the characteristic polynomial is the same as the dimension $n$ of the vector space $L$ over $K$, we obtain that $\chi(X) = p(X)^{n/m}$. ♯


Theorem 8.6.15 Let $L$ be a finite Galois extension of $K$ of degree $n$. Let $a \in L$. Then the characteristic polynomial $\chi(X)$ of $L_a$ is given by

$$\chi(X) = \prod_{\sigma \in G(L/K)} (X - \sigma(a)).$$


Proof Let us denote the polynomial $\prod_{\sigma \in G(L/K)} (X - \sigma(a))$ by $\psi(X)$. Its coefficients are fixed by every member of $G(L/K)$, and so $\psi(X) \in K[X]$. Since $a$ is a root of $\psi(X)$, it follows that the minimum polynomial $p(X)$ of $a$ divides $\psi(X)$. Also, given any member $\sigma \in G(L/K)$, $p^{\sigma}(X) = p(X)$, and so $\sigma(a)$ is a root of $p(X)$ for all $\sigma \in G(L/K)$. In other words, each $\sigma(a)$ has the same minimum polynomial $p(X)$. This also says that the only irreducible factor of $\psi(X)$ is $p(X)$. Comparing the degrees, we obtain that $\psi(X) = p(X)^{n/m}$, where $m = \deg p(X)$. From the previous proposition, it follows that $\chi(X) = \psi(X)$. ♯


Corollary 8.6.16 Let $L$ be a finite Galois extension of $K$, and $a \in L$. Then

$$N_{L/K}(a) = \prod_{\sigma \in G(L/K)} \sigma(a),$$

and

$$T_{L/K}(a) = \sum_{\sigma \in G(L/K)} \sigma(a).$$


Proof Since the determinant of a linear transformation is the product of the characteristic roots, and the trace is the sum of the characteristic roots, the result follows from the above theorem. ♯
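As a concrete check of the corollary, take $L = \mathbb{Q}(\sqrt{d})$ with $d$ a non-square rational, so that $G(L/\mathbb{Q})$ consists of the identity and the conjugation $x + y\sqrt{d} \mapsto x - y\sqrt{d}$. The sketch below is an illustration with assumed names (`mul`, `conj`, `mult_matrix`); elements of $L$ are stored as pairs $(x, y)$ of `Fraction` values. It builds the matrix of $L_a$ on the basis $\{1, \sqrt{d}\}$ and compares its determinant and trace with the product and sum of the conjugates.

```python
from fractions import Fraction

def mul(a, b, d):
    """Product of a = x1 + y1*sqrt(d) and b = x2 + y2*sqrt(d), as a pair."""
    (x1, y1), (x2, y2) = a, b
    return (x1 * x2 + d * y1 * y2, x1 * y2 + y1 * x2)

def conj(a):
    """The nontrivial automorphism of Q(sqrt(d)) over Q."""
    return (a[0], -a[1])

def mult_matrix(a, d):
    """Matrix of L_a on the basis {1, sqrt(d)}; columns are a*1 and a*sqrt(d)."""
    c1 = mul(a, (Fraction(1), Fraction(0)), d)
    c2 = mul(a, (Fraction(0), Fraction(1)), d)
    return [[c1[0], c2[0]],
            [c1[1], c2[1]]]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def trace2(m):
    return m[0][0] + m[1][1]

d = Fraction(2)
a = (Fraction(3), Fraction(1, 2))          # a = 3 + (1/2) sqrt(2)
m = mult_matrix(a, d)
norm_from_conjugates = mul(a, conj(a), d)  # should be (N(a), 0)
trace_from_conjugates = (a[0] + conj(a)[0], a[1] + conj(a)[1])
print(det2(m), trace2(m), norm_from_conjugates, trace_from_conjugates)
```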


Definition 8.6.17 Let $L$ be a Galois extension of $K$. A map $f$ from $G(L/K)$ to $L^*$ is called a 1-cocycle of $G(L/K)$ in $L^*$ if $f$ satisfies the following Emmy Noether equation:

$$f(\sigma\tau) = f(\sigma)\sigma(f(\tau)) \quad \text{for all } \sigma, \tau \in G(L/K).$$

A 1-cocycle is also called a crossed homomorphism.


Proposition 8.6.18 Let $L$ be a finite Galois extension of $K$, and $f$ be a 1-cocycle of $G(L/K)$ in $L^*$. Then there is an element $a \in L^*$ such that $f(\sigma) = \sigma(a)a^{-1}$ for all $\sigma \in G(L/K)$.


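Proof Since $f(\sigma) \neq 0$ for all $\sigma \in G(L/K)$, and by the Dedekind theorem $G(L/K)$ is linearly independent, it follows that $\sum_{\sigma \in G(L/K)} f(\sigma)\sigma \neq 0$. Hence there is an element $b \in L$ such that $\sum_{\sigma \in G(L/K)} f(\sigma)\sigma(b) \neq 0$. Let $a = \big(\sum_{\sigma \in G(L/K)} f(\sigma)\sigma(b)\big)^{-1}$. Since $f$ is a 1-cocycle, for any $\tau \in G(L/K)$, we have

$$f(\tau)\tau(a^{-1}) = f(\tau)\,\tau\Big(\sum_{\sigma \in G(L/K)} f(\sigma)\sigma(b)\Big) = \sum_{\sigma \in G(L/K)} f(\tau)\tau(f(\sigma))(\tau\sigma)(b) = \sum_{\sigma \in G(L/K)} f(\tau\sigma)(\tau\sigma)(b) = a^{-1}.$$

This shows that $f(\tau) = \tau(a)a^{-1}$ for all $\tau \in G(L/K)$. ♯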


Theorem 8.6.19 (Hilbert Theorem 90) Let $L$ be a cyclic Galois extension of $K$ of degree $n$, and $\sigma$ be a generator of $G(L/K)$. Then the kernel of the homomorphism $N_{L/K}$ from $L^*$ to $K^*$ is $\{\sigma(a)a^{-1} \mid a \in L^*\}$.


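Proof It follows from Corollary 8.6.16 that $N_{L/K}(a) = N_{L/K}(\sigma(a))$ for all $a \in L$ and $\sigma \in G(L/K)$. Hence $\sigma(a)a^{-1}$ is in the kernel of $N_{L/K}$ for all $a \in L^*$ and $\sigma \in G(L/K)$.

Conversely, suppose that $u$ belongs to the kernel of $N_{L/K}$. Then $N_{L/K}(u) = 1$. Define a map $f$ from $G(L/K)$ to $L^*$ by $f(I) = 1$, and for $1 \leq i \leq n - 1$, we define $f(\sigma^i) = u\sigma(u)\sigma^2(u)\cdots\sigma^{i-1}(u)$. We show that $f$ is a 1-cocycle of $G(L/K)$ in $L^*$. Suppose that $0 \leq i, j \leq n - 1$. There are two cases: (i) $i + j \leq n - 1$ and (ii) $i + j \geq n$. In case (i),

$$f(\sigma^i\sigma^j) = f(\sigma^{i+j}) = u\sigma(u)\sigma^2(u)\cdots\sigma^{i+j-1}(u) = f(\sigma^i)\sigma^i(f(\sigma^j)).$$

Now, consider the case (ii). In this case

$$f(\sigma^i\sigma^j) = f(\sigma^{i+j-n}) = u\sigma(u)\sigma^2(u)\cdots\sigma^{i+j-n-1}(u).$$

Also

$$f(\sigma^i)\sigma^i(f(\sigma^j)) = u\sigma(u)\cdots\sigma^{i-1}(u)\,\sigma^i\big(u\sigma(u)\cdots\sigma^{j-1}(u)\big) = u\sigma(u)\cdots\sigma^{i+j-n-1}(u)\,\sigma^{i+j-n}\big(u\sigma(u)\cdots\sigma^{n-1}(u)\big) = f(\sigma^i\sigma^j)\,\sigma^{i+j-n}(N_{L/K}(u)) = f(\sigma^i\sigma^j).$$

This shows that $f$ is a 1-cocycle. From the earlier proposition, there is an $a \in L^*$ such that $f(\sigma^i) = \sigma^i(a)a^{-1}$ for all $i$. In particular, $u = f(\sigma) = \sigma(a)a^{-1}$. ♯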
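The proof is constructive, and for $L = \mathbb{Q}(i)$, $K = \mathbb{Q}$ it can be carried out by hand or by machine. Here $\sigma$ is complex conjugation, $f(I) = 1$, $f(\sigma) = u$, and the element produced by Proposition 8.6.18 is $a = (b + u\sigma(b))^{-1}$ for any $b$ making the sum nonzero. The sketch below is an illustration only; the names `mul`, `conj`, `inv`, `hilbert90` are hypothetical, and Gaussian rationals are stored as pairs of `Fraction` values.

```python
from fractions import Fraction

def mul(z, w):
    """Product of Gaussian rationals z = (re, im) and w = (re, im)."""
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])

def conj(z):
    """Complex conjugation, the generator sigma of G(Q(i)/Q)."""
    return (z[0], -z[1])

def inv(z):
    """Inverse of a nonzero Gaussian rational (entries must be Fractions)."""
    n = z[0] * z[0] + z[1] * z[1]
    return (z[0] / n, -z[1] / n)

def hilbert90(u, b=(Fraction(1), Fraction(0))):
    """Given u of norm 1, return a with sigma(a) * a^{-1} = u, following the
    proof: a = (f(I) b + f(sigma) sigma(b))^{-1} with f(I) = 1, f(sigma) = u."""
    ub = mul(u, conj(b))
    c = (b[0] + ub[0], b[1] + ub[1])
    if c == (0, 0):
        raise ValueError("the sum vanished; choose another b")
    return inv(c)

u = (Fraction(3, 5), Fraction(4, 5))   # u * conj(u) = 1
a = hilbert90(u)
print(a, mul(conj(a), inv(a)) == u)
```

For $u = -1$ the default choice $b = 1$ gives a vanishing sum, which is why the proof only asserts the existence of a suitable $b$; passing $b = i$ works in that case.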


Exercises 


8.6.1 Find a primitive 16th root of 1, and also the minimum polynomial of a primitive 16th root of 1 over $\mathbb{Q}$. What is the degree of the corresponding cyclotomic extension over $\mathbb{Q}$? Find also the intermediary subfields.


8.6.2 Let $\mathbb{Q}_n$ denote the splitting field of $X^n - 1$ over $\mathbb{Q}$, and $\mathbb{Q}_m$ the splitting field of $X^m - 1$. Show that $\mathbb{Q}_n \cap \mathbb{Q}_m = \mathbb{Q}_d$, where $d$ is the g.c.d. of $n$ and $m$.


8.6.3 Show that the $n$th cyclotomic polynomial $\phi_n(X)$ over $\mathbb{Q}$ is also given by

$$\phi_n(X) = \prod_{d \mid n} (X^d - 1)^{\mu(n/d)},$$

where $\mu$ is the Möbius function.
Hint. Use the Möbius inversion formula and Proposition 8.6.5.


8.6.4 Find all subfields of Quy. 
8.6.5 Show that cos = and sin are both algebraic numbers. 
8.6.6 Is Q(cos**) Galois over Q. If yes, what is the Galois group? 


8.6.7 Let $L$ be a Galois extension of $K$, and $F$ be an intermediary field such that $F$ is a Galois extension of $K$. Show that $N_{L/K} = N_{F/K} \circ N_{L/F}$ and $T_{L/K} = T_{F/K} \circ T_{L/F}$.


8.6.8 Use Hilbert's Theorem 90 to prove Theorem 8.6.10.


8.6.9 Let $L$ be a cyclic Galois extension of $K$ with $G(L/K)$ generated by $\sigma$. A map $f$ from $G(L/K)$ to $L$ is called a 1-cocycle of $G(L/K)$ in $L$ if $f(\sigma\tau) = f(\sigma) + \sigma(f(\tau))$ for all $\sigma, \tau \in G(L/K)$. Use the methods of Proposition 8.6.18 and Theorem 8.6.12 to prove that the null space of $T_{L/K}$ is $\{\sigma(a) - a \mid a \in L\}$. This is known as the additive Hilbert Theorem 90.


8.7 Geometric Constructions 


We shall mainly be interested in the problem of constructions using straightedge and compass.

First, we try to understand the meaning of geometric constructions. We start with two points O and P in a plane and take the length of the segment OP as the unit. We extend the line OP indefinitely and take it as the X-axis with O as the origin and P as the marked point. We construct points, lines, and circles inductively. At each step of the construction, we draw a line through two already constructed points, or draw a circle with center at one of the constructed points and radius the segment joining that point to another constructed point, and then take the intersections of the newly constructed line or circle with the already existing lines and circles to construct new points. We can construct a line through O perpendicular to the line OP to draw the Y-axis. By drawing a circle with center O and radius OP, we get a point Q on the Y-axis whose coordinates are (0, 1). Then, by drawing the perpendicular to OQ at Q and to OP at P, and taking their intersection, we determine the point whose coordinates are (1, 1). This is how we proceed with the constructions. A real number $a$ is said to be constructible if we can construct two points which are at a distance $|a|$ apart.

We recall some standard school level geometric constructions by ruler and com- 
pass. 


(i). We can draw a line perpendicular to a line from any point on that line. 
(ii). We can draw perpendicular to a line from any point outside the line. 
(iii). We can draw a perpendicular bisector to any segment of a line. 




(iv). Given any segment of a line we can draw an equilateral triangle with the given 
segment as a base. In particular, we can construct a 60° angle. 

(v). Given a quadrilateral, we can construct a triangle which has the same area as 
the given quadrilateral. 

(vi). Given segments of lengths $a$ and $b$ units, we can construct segments of lengths $a + b$, $a - b$ (when $a > b$), $ab$, and $a/b$. In particular, given a unit segment, we can construct a segment of length $r > 0$, where $r$ is a rational number.

(vii). Given any segment of length $a$, we can construct a segment of length $\sqrt{a}$. In particular, given segments of lengths $|a|$ and $|b|$, we can construct segments of lengths $|\alpha|$ and $|\beta|$, where $\alpha$ and $\beta$ are solutions of the equation $X^2 - (a + b)X + ab = 0$.


If a point (a, b) is constructible, then drawing perpendicular from that point to 
X-axis and also to Y-axis, we see that (a, 0) and (0, b) are also constructible points. 
Conversely, if (a, 0) and (0, b) are constructible points, then drawing perpendicular 
on X-axis through (a, 0) and on Y-axis through (0, b), and then taking the intersection, 
we construct the point $(a, b)$. Next, suppose that the points U and V are constructible points, and the length of the segment UV is $l$. Then we can construct the point $(l, 0)$ (and also $(0, l)$) as follows: We can draw a parallelogram with sides OU and UV. Let W be the other vertex of the parallelogram. Then the length of the segment OW is $l$. Draw a circle with center O and radius OW. The point of intersection of this circle with the X-axis is the point $(l, 0)$. The above discussion leads to the following proposition.


Proposition 8.7.1 Let $L$ denote the set of all constructible numbers. Then $L \times L$ is the set of all constructible points. Further, $L$ is a subfield of $\mathbb{R}$, and every positive member of $L$ has a square root in $L$. In other words, there is no proper real quadratic extension of $L$. ♯


Proposition 8.7.2 Let $a$ be a positive real number such that there is a tower $\mathbb{Q} = K_0 \subseteq K_1 \subseteq K_2 \subseteq \cdots \subseteq K_n$ of extensions such that $K_n \subseteq \mathbb{R}$, $[K_{i+1} : K_i] \leq 2$, and $a \in K_n$. Then $a$ is constructible.


Proof The proof is by induction on $n$. If $n = 0$, then $a \in \mathbb{Q}$, and since the set of all constructible numbers is a subfield, $a$ is constructible. Assume the result for $n$. Let $a \in K_{n+1}$, where $\mathbb{Q} = K_0 \subseteq K_1 \subseteq K_2 \subseteq \cdots \subseteq K_n \subseteq K_{n+1} \subseteq \mathbb{R}$ is a tower of extensions such that $[K_{i+1} : K_i] \leq 2$. By the induction hypothesis each member of $K_n$ is constructible. Since $[K_{n+1} : K_n] \leq 2$, $K_{n+1} = K_n$, or else $[K_{n+1} : K_n] = 2$. If $a \in K_n$, then there is nothing to do. Suppose that $a \notin K_n$. Then $[K_{n+1} : K_n] = 2$, and so $K_{n+1} = K_n(\sqrt{b})$ for some positive $b \in K_n$, and $a = c + d\sqrt{b}$ for some $c, d \in K_n$. By the induction hypothesis $b$, $c$ and $d$ are constructible. Hence, from our basic constructions, $a$ is also constructible. ♯


Now, we aim to prove the converse of the above proposition. First, we have some 
definitions. 


Definition 8.7.3 Let $K$ be a subfield of $\mathbb{R}$. An element of $K \times K$ will be called a point in the plane of $K$. A line passing through two points of the plane of $K$ is called a line in the plane of $K$. A circle with center in the plane of $K$, and radius a positive member of $K$, is called a circle in the plane of $K$.


Proposition 8.7.4 A line $aX + bY = c$ in $\mathbb{R}^2$ is a line in the plane of $K$ if and only if there is a nonzero $\lambda \in \mathbb{R}$ such that $\lambda a, \lambda b, \lambda c \in K$.


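Proof Suppose that $aX + bY = c$ is a line in the plane of $K$. Then there are distinct points $(\alpha_1, \beta_1)$ and $(\alpha_2, \beta_2)$ in the plane of $K$ which lie on the line. Thus,

$$a\alpha_1 + b\beta_1 = c = a\alpha_2 + b\beta_2.$$

Then $a(\alpha_1 - \alpha_2) = b(\beta_2 - \beta_1)$. Since the points are distinct, $\alpha_1 \neq \alpha_2$ or $\beta_1 \neq \beta_2$. Suppose that $\alpha_1 \neq \alpha_2$. Then $a = b\,\frac{\beta_2 - \beta_1}{\alpha_1 - \alpha_2}$ and $c = b\alpha_2\,\frac{\beta_2 - \beta_1}{\alpha_1 - \alpha_2} + b\beta_2$. Here $b \neq 0$, for otherwise $a = 0 = c$, which is not possible. Now, it is clear that $\lambda = b^{-1}$ will serve the purpose. The case $\beta_1 \neq \beta_2$ is similar. For the converse, we may assume that $a, b, c \in K$. Suppose that $a \neq 0 \neq b$. Then the line passes through the points $(ca^{-1}, 0)$ and $(0, cb^{-1})$ in the plane of $K$ (if $c = 0$ these two points coincide, and we may use $(0, 0)$ and $(b, -a)$ instead). Suppose that $a = 0 \neq b$. Then the line passes through $(0, cb^{-1})$ and $(1, cb^{-1})$, which are in the plane of $K$. Similarly, if $a \neq 0 = b$, then the line passes through the points $(ca^{-1}, 0)$ and $(ca^{-1}, 1)$ in the plane of $K$. ♯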


Proposition 8.7.5 Suppose that $X^2 + Y^2 + 2uX + 2vY + w = 0$ is a circle in the plane of $K$. Then $u, v, w \in K$. Conversely, suppose that $u, v, w \in K$. Then it represents a circle with center in the plane of $K$, and the radius $\sqrt{u^2 + v^2 - w}$ in $K(\sqrt{u^2 + v^2 - w})$.


Proof If the given circle is in the plane of $K$, then the center $(-u, -v)$ is a point in the plane of $K$, and also the radius $\sqrt{u^2 + v^2 - w}$ belongs to $K$. This means that $u$, $v$ and $u^2 + v^2 - w$ belong to $K$, and so $u, v, w \in K$. Conversely, suppose that $u, v, w \in K$. Then the center $(-u, -v)$ is in the plane of $K$, and the radius is $\sqrt{u^2 + v^2 - w}$. ♯


Proposition 8.7.6 If two lines in the plane of $K$ intersect, they intersect in the plane of $K$.


Proof Let $aX + bY = c$ and $a'X + b'Y = c'$ be two lines in the plane of $K$. We may assume that all the coefficients are in $K$. Since these lines are not parallel, the simultaneous equations have a unique solution, expressed rationally in terms of the coefficients, which are in $K$. Thus, these two lines intersect in the plane of $K$. ♯


Proposition 8.7.7 The intersection of a line in the plane of $K$ and a circle in the plane of $K$ is either the empty set, or a point in the plane of $K$, or it consists of two points in the plane of $K(\sqrt{d})$ for some positive $d$ in $K$.


Proof Let $aX + bY = c$ be a line in the plane of $K$, and

$$X^2 + Y^2 + 2uX + 2vY + w = 0$$

a circle in the plane of $K$. We may assume that all the coefficients are in $K$ and $b \neq 0$. Substituting $Y = \frac{c - aX}{b}$ in the equation of the circle, and then solving, we get that either the solutions are imaginary, and in this case they do not intersect, or they intersect at one point in the plane of $K$ (this is when the discriminant of the quadratic equation obtained is 0), or they intersect at two points in the plane of $K(\sqrt{d})$, where $d$ is the discriminant of the quadratic equation obtained after the substitution of the value $Y = \frac{c - aX}{b}$ in the equation of the circle. ♯
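The substitution in the proof can be made explicit. Clearing the denominator $b^2$ gives the quadratic $(a^2 + b^2)X^2 + (2ub^2 - 2ac - 2vab)X + (c^2 + 2vbc + wb^2) = 0$, whose discriminant $d$ decides between the three cases. The sketch below assumes $K = \mathbb{Q}$ and uses illustrative function names; it is a worked instance, not part of the text.

```python
from fractions import Fraction

def line_circle_quadratic(a, b, c, u, v, w):
    """Coefficients (A, B, C) of A X^2 + B X + C = 0 obtained by substituting
    Y = (c - aX)/b into X^2 + Y^2 + 2uX + 2vY + w = 0 and multiplying by b^2.
    Requires b != 0."""
    A = a * a + b * b
    B = -2 * a * c + 2 * u * b * b - 2 * v * a * b
    C = c * c + 2 * v * c * b + w * b * b
    return A, B, C

def classify(a, b, c, u, v, w):
    """Return the case of Proposition 8.7.7 together with the discriminant d."""
    A, B, C = line_circle_quadratic(a, b, c, u, v, w)
    d = B * B - 4 * A * C
    if d < 0:
        return "empty", d
    if d == 0:
        return "one point in the plane of K", d
    return "two points in the plane of K(sqrt(d))", d

unit_circle = (Fraction(0), Fraction(0), Fraction(-1))   # u, v, w
print(classify(Fraction(1), Fraction(-1), Fraction(0), *unit_circle))  # line X - Y = 0
print(classify(Fraction(0), Fraction(1), Fraction(1), *unit_circle))   # line Y = 1
print(classify(Fraction(0), Fraction(1), Fraction(2), *unit_circle))   # line Y = 2
```

For the line $X - Y = 0$ and the unit circle the discriminant is 8, so the intersection points lie in the plane of $\mathbb{Q}(\sqrt{8}) = \mathbb{Q}(\sqrt{2})$.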


Proposition 8.7.8 Given two circles in the plane of K one and only one of the 
following holds: 


(i). They do not intersect. 
(ii). They touch each other at a point in the plane of K. 
(iii). They intersect at two points in the plane of $K(\sqrt{d})$ for some positive $d \in K$.


Proof Let

$$X^2 + Y^2 + 2uX + 2vY + w = 0$$

and

$$X^2 + Y^2 + 2u'X + 2v'Y + w' = 0$$

be two circles in the plane of $K$. Then $u, v, w, u', v', w'$ belong to $K$. The intersection of these circles is the same as the intersection of the line

$$2(u - u')X + 2(v - v')Y + w - w' = 0,$$

obtained by subtracting the two equations, with either of the two given circles. Now, the result follows from the previous proposition. ♯


Theorem 8.7.9 A real number $a$ is constructible if and only if there is a tower

$$\mathbb{Q} = K_0 \subseteq K_1 \subseteq K_2 \subseteq \cdots \subseteq K_n$$

of field extensions such that $a \in K_n$ and $[K_{i+1} : K_i] \leq 2$.


Proof From Proposition 8.7.2, it follows that if $a$ lies in such a tower, then it is constructible. We prove the converse. Suppose that $a$ is constructible. Then the point $(a, 0)$ can be constructed in a finite number of steps starting from the points in the plane of $\mathbb{Q}$. This is obtained by taking intersections of constructible lines and circles starting from lines and circles in the plane of $\mathbb{Q}$. In the first step the point will lie in the plane of $\mathbb{Q}$ or in the plane of $K_1 = \mathbb{Q}(\sqrt{a_1})$ for some positive $a_1 \in \mathbb{Q}$. In the second step the point will lie in the plane of $K_1$ or in the plane of $K_2 = K_1(\sqrt{a_2})$ for some positive $a_2 \in K_1$. Proceeding inductively we arrive at the result. ♯


Corollary 8.7.10 A real number $a$ is constructible only if $[\mathbb{Q}(a) : \mathbb{Q}]$ is a power of 2.


Proof Suppose that $a$ is constructible. From the above theorem, $a \in K_n$, where $[K_n : \mathbb{Q}] = 2^m$ for some $m$. Since $[\mathbb{Q}(a) : \mathbb{Q}]$ divides $[K_n : \mathbb{Q}]$, the result follows. ♯


We are now ready to answer the classical problems on geometric constructions. 


Proposition 8.7.11 An angle $\theta$ is constructible if and only if $\cos\theta$, or equivalently, $\sin\theta$ is constructible.


Proof Suppose that $\cos\theta$ is constructible. Then we can construct the point $P(\cos\theta, 0)$. Draw a perpendicular line to the X-axis from this point, and also a unit circle with center at the origin. Let R be the point of intersection of the line with this unit circle. Then the angle $\angle POR$ is the required angle. Conversely, suppose that the angle $\theta$ is constructible. Let OQ be a line making angle $\theta$ with the X-axis. Draw a unit circle with O as center, and let R be the intersection of this circle with the line OQ. Draw the perpendicular from R to the X-axis, which meets it at a point P (say). Then OP is a segment of length $\cos\theta$. ♯


Theorem 8.7.12 It is impossible to construct a 20° angle using ruler and compass only.


Proof From the previous result, if we can construct a 20° angle, then $a = \cos 20^\circ$ is constructible. Now

$$\frac{1}{2} = \cos 60^\circ = 4\cos^3 20^\circ - 3\cos 20^\circ = 4a^3 - 3a.$$

Thus, $a$ is a root of the polynomial $8X^3 - 6X - 1$. Since this polynomial is of degree 3 and has no rational root (prove it), it is irreducible over $\mathbb{Q}$. Hence $[\mathbb{Q}(a) : \mathbb{Q}] = 3$, which is not a power of 2. From Corollary 8.7.10, $a$ is not constructible. ♯
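The step "has no rational root" can be confirmed with the rational root test: any rational root of $8X^3 - 6X - 1$, written in lowest terms, has numerator dividing 1 and denominator dividing 8. The following minimal sketch (with illustrative names `divisors` and `rational_roots`) enumerates these candidates exactly using `Fraction`.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_roots(coeffs):
    """Rational roots of an integer polynomial given by coefficients with the
    highest degree first; leading and constant terms must be nonzero."""
    lead, const = coeffs[0], coeffs[-1]
    candidates = {
        Fraction(s * p, q)
        for p in divisors(const)
        for q in divisors(lead)
        for s in (1, -1)
    }

    def value(x):
        acc = Fraction(0)
        for c in coeffs:          # Horner evaluation
            acc = acc * x + c
        return acc

    return sorted(x for x in candidates if value(x) == 0)

print(rational_roots([8, 0, -6, -1]))   # 8X^3 - 6X - 1
```

An empty list, together with the degree being 3, gives the irreducibility used in the proof.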


Corollary 8.7.13 Trisection of a 60° angle is impossible by ruler and compass construction.


Proof Since a 60° angle can be constructed by ruler and compass, if it were possible to trisect a 60° angle, then a 20° angle would be constructible. This is impossible because of the above theorem. ♯


Corollary 8.7.14 It is impossible to duplicate a unit cube by ruler and compass constructions.


Proof If we can duplicate a unit cube by ruler and compass, then we can construct a segment of length $2^{1/3}$. Since $[\mathbb{Q}(2^{1/3}) : \mathbb{Q}] = 3$ is not a power of 2, this is impossible. ♯


Recall that a complex number $a$ is called an algebraic number if it is algebraic over $\mathbb{Q}$. We state a result without proof.


Theorem 8.7.15 (Lindemann-Weierstrass) Let $a_1, a_2, \ldots, a_n$ be $n$ distinct algebraic numbers. Then $\{e^{a_1}, e^{a_2}, \ldots, e^{a_n}\}$ is a linearly independent set over $\mathbb{Q}$. ♯




Corollary 8.7.16 $\pi$ and $e$ are not algebraic over $\mathbb{Q}$.


Proof If $e$ is a root of a nonzero polynomial over $\mathbb{Q}$, then there are rational numbers $a_0, a_1, \ldots, a_n$, not all zero, such that

$$a_0e^0 + a_1e^1 + \cdots + a_ne^n = 0.$$

This means that $\{e^0, e^1, \ldots, e^n\}$ is linearly dependent. Since $0, 1, 2, \ldots, n$ are algebraic over $\mathbb{Q}$, we arrive at a contradiction to the theorem of Lindemann and Weierstrass. Thus, $e$ is not algebraic over $\mathbb{Q}$.

Since $e^0 = 1$ and $e^{\pi i} = -1$ are rational numbers, they are linearly dependent over $\mathbb{Q}$. By the theorem of Lindemann and Weierstrass, it follows that $0$ and $\pi i$ are not both algebraic. Since 0 is algebraic, it follows that $\pi i$ is not algebraic. Again, since $i$ is algebraic, it follows that $\pi$ is not algebraic. ♯


Remark 8.7.17 It is not known if $\pi$ is algebraic over $\mathbb{Q}(e)$.


Theorem 8.7.18 It is impossible to construct a square by ruler and compass whose
area is the area bounded by the unit circle.


Proof Suppose that it is possible to construct such a square. Then $\sqrt{\pi}$ will be a
constructible number. Since $\pi$ is not algebraic, $\sqrt{\pi}$ is also not algebraic. But then
$\mathbb{Q}(\sqrt{\pi})$ is of infinite degree over $\mathbb{Q}$, whereas a constructible number generates an
extension of finite degree over $\mathbb{Q}$ (Proposition 8.7.2). This is a contradiction. ♯


Theorem 8.7.19 A regular polygon of n sides is constructible (by ruler and compass)
if and only if $\phi(n)$ is a power of 2.


Proof A regular polygon of n sides is constructible if and only if the angles $\frac{2\pi}{n}$ at
the center are constructible. This is possible if and only if $\cos\frac{2\pi}{n}$ is constructible.
Let $\rho = e^{2\pi i/n}$ denote the primitive nth root of 1. Since $\cos\frac{2\pi}{n} = \frac{\rho + \rho^{-1}}{2}$, it follows
that $\cos\frac{2\pi}{n} \in \mathbb{Q}(\rho)$. Since $\rho \notin \mathbb{R}$ and $\cos\frac{2\pi}{n} \in \mathbb{R}$, it follows that $\mathbb{Q}(\cos\frac{2\pi}{n})$ is a
proper subfield of $\mathbb{Q}(\rho)$. Further, since $\rho$ is a root of the polynomial $X^2 - 2\cos\frac{2\pi}{n}X + 1$,
it follows that $[\mathbb{Q}(\rho) : \mathbb{Q}(\cos\frac{2\pi}{n})] = 2$. Hence if $\cos\frac{2\pi}{n}$ is constructible, then
$\phi(n) = [\mathbb{Q}(\rho) : \mathbb{Q}]$ is a power of 2.

Conversely, suppose that $\phi(n)$ is a power of 2. Since $\mathbb{Q}(\rho)$ is a Galois extension of $\mathbb{Q}$
which is abelian (its Galois group is isomorphic to $U_n$) of degree $\phi(n) = 2^m$, all the subgroups
of the Galois group $G(\mathbb{Q}(\rho)/\mathbb{Q})$ are normal. In particular, $G(\mathbb{Q}(\rho)/\mathbb{Q}(\cos\frac{2\pi}{n}))$ is
normal, and $H = G(\mathbb{Q}(\cos\frac{2\pi}{n})/\mathbb{Q})$ is an abelian group of order $2^{m-1}$. From the
theory of abelian groups, we have a normal series

$$H = H_1 \rhd H_2 \rhd \cdots \rhd H_m = \{I\},$$

where $[H_i : H_{i+1}] = 2$. Taking the fixed fields, we obtain a chain

$$\mathbb{Q} = F(H_1) \subset F(H_2) \subset \cdots \subset F(H_m) = \mathbb{Q}\left(\cos\frac{2\pi}{n}\right)$$

such that $[F(H_i) : F(H_{i-1})] = 2$. The result follows from Proposition 8.7.2. ♯




In particular, it is impossible to construct a regular polygon with 9 sides by ruler
and compass, since $\phi(9) = 6$ is not a power of 2. It is of course possible to construct a
regular polygon of 17 sides ($\phi(17) = 2^4$). An explicit algorithm to construct a regular
polygon of 17 sides was given by Gauss in 1801.
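
The criterion of Theorem 8.7.19 is easy to tabulate for small n. The sketch below computes $\phi(n)$ directly from its definition and tests whether it is a power of 2; the function names are our own illustrative choices.

```python
from math import gcd

def phi(n):
    """Euler's phi function, computed directly from the definition."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def is_power_of_two(m):
    # A positive integer is a power of 2 exactly when it has a single 1 bit.
    return m >= 1 and (m & (m - 1)) == 0

def constructible(n):
    """True if the regular n-gon is constructible, by Theorem 8.7.19."""
    return is_power_of_two(phi(n))

print([n for n in range(3, 21) if constructible(n)])
# prints [3, 4, 5, 6, 8, 10, 12, 15, 16, 17, 20]
```

The values 7, 9, 11, 13, 14, 18 and 19 are missing from the list, in agreement with the remark about the polygon with 9 sides.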


Exercises 


8.7.1 Can we divide a right angle in 10 equal parts by ruler and compass? Support. 


8.7.2 Let a and b be positive rationals. Can we construct a square whose area is 
same as that of the ellipse with major axis 2a and minor axis 2b? Support. 


8.7.3, Can we construct an angle 9 Support. 


2n9 


8.7.4 Can we construct a circular arc of length +>? 


Support. 


8.7.5 Show that a regular polygon of n sides is constructible by ruler and compass if
and only if all odd prime divisors of n are of the form $2^m + 1$.


8.7.6 Think of a machine which can construct cube root of a rational number. 


8.8 Galois Theory of Equation 


In this section, we determine a necessary and sufficient condition (due to Galois) for 
the solvability of polynomial equations by the field and the radical operations. 


Definition 8.8.1 Let L be a field extension of K. We say that L is a radical extension
of K if there exist elements $a_1, a_2, \ldots, a_r$ in L, and positive integers $n_1, n_2, \ldots, n_r$
such that $L = K(a_1, a_2, \ldots, a_r)$, where $a_i^{n_i} \in K(a_1, a_2, \ldots, a_{i-1})$ for all $i \geq 1$. If we
can take $n = n_i$ for all i, then we say that it is an n-radical extension of K.


Thus, every radical extension is an algebraic extension. To say that L is a radical
extension of K is to say that there is a tower $K = K_0 \subseteq K_1 \subseteq K_2 \subseteq \cdots \subseteq K_r = L$
of finite simple extensions, where $K_i = K_{i-1}(a_i)$ and some power of $a_i$ lies in $K_{i-1}$.
Further, any radical extension is an n-radical extension for $n = n_1 n_2 \cdots n_r$. Also, every
cyclotomic extension is a radical extension. Observe further that if L is an n-radical
extension of F, and F is an n-radical extension of K, then L is an n-radical extension of K.


Proposition 8.8.2 Let L be an n-radical extension of K, and L' the normal closure of
the extension L of K. Then L' is also an n-radical extension of K.


Proof Let $L = K(a_1, a_2, \ldots, a_r)$, where $a_i^n \in K(a_1, a_2, \ldots, a_{i-1})$. The proof is by
induction on r. Suppose that $L = K(a_1)$ such that $a_1^n \in K$. Let $p_1(X)$ be the
minimum polynomial of $a_1$ over K. Since $a_1$ is a root of $X^n - a$, where $a = a_1^n \in K$,
it follows that $p_1(X)$ divides $X^n - a$. Thus, every root $\beta$ of $p_1(X)$ satisfies the relation
$\beta^n = a \in K$. The normal closure $L' = K(\beta_1, \beta_2, \ldots, \beta_s)$, where $\beta_i$ are all roots




of $p_1(X)$ over K is therefore an n-radical extension of K. Assume that the result is true
for r. Let $L = K(a_1, a_2, \ldots, a_r, a_{r+1})$ be an n-radical extension of K. Let $L_0$ be the
normal closure of $K(a_1, a_2, \ldots, a_r)$ over K. Then $L_0$ is the splitting field of the set
$\{p_i(X) \mid 1 \leq i \leq r\}$, where $p_i(X)$ is the minimum polynomial of $a_i$ over K, and by the
induction hypothesis $L_0$ is an n-radical extension of K. Then the normal closure L' of
L over K is the splitting field of $p_{r+1}(X)$ over $L_0$. Let $\beta_1, \beta_2, \ldots, \beta_t$ be the roots of
$p_{r+1}(X)$. Each $\beta_j^n$ is a conjugate over K of $a_{r+1}^n \in K(a_1, a_2, \ldots, a_r) \subseteq L_0$, and $L_0$ is
normal over K, so $\beta_j^n \in L_0$.
Thus L' is an n-radical extension of $L_0$, and $L_0$ is an n-radical extension of K. This shows
that L' is an n-radical extension of K. ♯


Definition 8.8.3 Let K be a field and $f(X) \in K[X]$. Then f(X) is said to be
solvable by radical operations if the splitting field of f(X) is contained in a
radical extension of K.


Proposition 8.8.4 Let K be a field containing a primitive nth root of 1, and L be an abelian
Galois extension of K such that the exponent of G(L/K) divides n. Then L is an n-radical
extension of K.


Proof From the structure theorem of finite abelian groups, we have

$$G(L/K) = C_1 \times C_2 \times \cdots \times C_r,$$

where $C_i$ are cyclic groups of prime power order such that the order $m_i$ of each $C_i$
divides n. Let $L_i$ denote the fixed field of

$$H_i = C_1 \times C_2 \times \cdots \times C_{i-1} \times C_{i+1} \times \cdots \times C_r.$$

Then $G(L_i/K)$ is isomorphic to $C_i$ for each i. Thus, $L_i$ over K is a cyclic Galois
extension of degree $m_i$, and since $m_i$ divides n, K also contains a primitive $m_i$th root of
1. From Theorem 8.6.10, we see that $L_i = K(a_i)$, where $a_i^{m_i} \in K$, and so $a_i^n \in K$
for all i. Clearly, $K(a_1, a_2, \ldots, a_r) = L_1L_2 \cdots L_r$ is the fixed field of $\bigcap_i H_i = \{I\}$.
Hence $K(a_1, a_2, \ldots, a_r) = L$. This shows that L is an n-radical extension of K. ♯


Theorem 8.8.5 (Galois) Let K be a field of characteristic 0 and $f(X) \in K[X]$. Let
L be the splitting field of f(X). Then f(X) is solvable by radicals if and only if the
Galois group G(L/K) is a solvable group.


Proof Suppose that f(X) is solvable by radical operations. Let F be an n-radical
extension of K containing the splitting field L of f(X). Since char K = 0, we may
assume, by Proposition 8.8.2, that F is also a Galois extension. Let L' be the splitting
field of $X^n - 1$ over F. Since the characteristic of K is 0, $L' = F(\rho)$, where $\rho$ is a primitive
nth root of 1. Since F is an n-radical extension of K, L' is also an n-radical extension of $K(\rho)$.
Thus, we have a tower

$$K = K_0 \subseteq K_1 = K(\rho) \subseteq K_2 \subseteq K_3 \subseteq \cdots \subseteq K_t = L',$$




where $K_{i+1} = K_i(a_i)$, $i \geq 1$, for some $a_i$ such that $a_i^n \in K_i$. Since each $K_i$, $i \geq 1$,
contains a primitive nth root of 1, it follows by Theorem 8.6.11 that $K_{i+1}$ is a cyclic
Galois extension of $K_i$ for all $i \geq 1$. Since $K_1$ is a cyclotomic extension of K, it follows
that it is abelian. It is also clear that L' is a Galois extension of each $K_i$, and we have
a normal series

$$G(L'/K) \trianglerighteq G(L'/K_1) \trianglerighteq G(L'/K_2) \trianglerighteq \cdots \trianglerighteq G(L'/K_t) = \{I\}$$

whose factors $G(L'/K_i)/G(L'/K_{i+1}) \cong G(K_{i+1}/K_i)$ are all abelian (as observed
above). Thus, G(L'/K) is a solvable group. By the fundamental theorem of Galois
theory, it follows that G(L/K) is isomorphic to the quotient G(L'/K)/G(L'/L). This
shows that G(L/K) is solvable.

Conversely, suppose that G(L/K) is solvable, and

$$G(L/K) = G_0 \rhd G_1 \rhd \cdots \rhd G_r = \{I\}$$

is a normal series of G(L/K) with abelian factors. Let $F_i = F(G_i)$ be the fixed field of $G_i$.
Then from the fundamental theorem of Galois theory, it follows that $F_{i+1}$ is Galois over $F_i$ with
Galois group isomorphic to the abelian group $G_i/G_{i+1}$. Let $\rho$ be a primitive nth root of
1, where n is the exponent of G(L/K). Note that it exists because the characteristic
of K is 0. Let $L_i = F_i(\rho)$. Then we get a tower of extensions

$$K \subseteq L_0 \subseteq L_1 \subseteq \cdots \subseteq L_r = L(\rho).$$

It follows, by induction, that $L_{i+1} = L_iF_{i+1}$, and by Exercise 8.6.6, $G(L_{i+1}/L_i)$
is isomorphic to a subgroup of the abelian group $G(F_{i+1}/F_i)$. Clearly, the exponent of
$G(L_{i+1}/L_i)$ divides n. From the above proposition, it follows that $L_{i+1}$ is an n-radical
extension of $L_i$ for each i. Since $L_0 = K(\rho)$ is already an n-radical extension of K,
it follows that $L(\rho)$ is an n-radical extension of K, and so L is contained in a radical
extension of K. This means that f(X) is solvable by the radical operations. ♯


Let L denote the field $K(X_1, X_2, \ldots, X_n)$ of fractions of the polynomial ring
$K[X_1, X_2, \ldots, X_n]$, where K is a field of characteristic 0. Then every permutation
$p \in S_n$ defines uniquely an automorphism of L. Indeed, $S_n$ is isomorphic to a
subgroup of Aut(L) (see Example 8.3.11) which we denote by $S_n$ again. Let F denote the
fixed field of $S_n$. Then we have seen (Example 8.3.11) that $F = K(s_1, s_2, \ldots, s_n)$, where
$s_1, s_2, \ldots, s_n$ are the elementary symmetric polynomials in $X_1, X_2, \ldots, X_n$.
The polynomial

$$f(X) = X^n - s_1X^{n-1} + s_2X^{n-2} - \cdots + (-1)^n s_n = (X - X_1)(X - X_2) \cdots (X - X_n)$$

in F[X] is called a general nth degree polynomial over K. The question is: can
we determine a formula for $X_1, X_2, \ldots, X_n$ in terms of the symmetric polynomials
using field and radical operations? In other words, can we solve a general nth degree
equation over K by the field and the radical operations? Since the Galois group of




f(X) over F is $S_n$, and $S_n$ is not a solvable group for $n \geq 5$, the following result of
Abel and Ruffini follows from the above theorem of Galois.


Theorem 8.8.6 (Abel–Ruffini) A general nth degree equation, $n \geq 5$, over a field
K of characteristic 0 is not solvable by radicals. ♯
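
The group-theoretic input to this theorem, that $S_n$ is solvable for $n \leq 4$ but not for $n = 5$, can be verified by brute force for small n. The sketch below computes the orders of the successive commutator subgroups; permutations are stored as tuples, and all function names are our own illustrative choices.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

def generated(gens, n):
    """Subgroup of S_n generated by gens (closure under products suffices,
    because every element of a finite group has finite order)."""
    identity = tuple(range(n))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def derived_subgroup(group, n):
    """Subgroup generated by all commutators a b a^-1 b^-1."""
    comms = {compose(compose(a, b), compose(inverse(a), inverse(b)))
             for a in group for b in group}
    return generated(comms, n)

def derived_series_orders(n):
    """Orders of S_n, its commutator subgroup, and so on until the series stops."""
    group = set(permutations(range(n)))
    orders = [len(group)]
    while True:
        nxt = derived_subgroup(group, n)
        if len(nxt) == len(group):
            return orders
        group = nxt
        orders.append(len(group))

print(derived_series_orders(4))   # prints [24, 12, 4, 1]
print(derived_series_orders(5))   # prints [120, 60]
```

For $S_4$ the series reaches the trivial group, so $S_4$ is solvable; for $S_5$ it stabilizes at a subgroup of order 60 (the alternating group $A_5$), so $S_5$ is not solvable.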


Example 8.8.7 Let f(X) be an irreducible polynomial over $\mathbb{Q}$ of degree p, where
$p \geq 5$ is a prime, such that it has exactly 2 nonreal roots (which are of course
conjugate to each other). Then the Galois group of f(X) over $\mathbb{Q}$ is $S_p$. We prove
it. Let L be the splitting field of f(X). Since f(X) is irreducible, p divides
$|G(L/\mathbb{Q})|$. Also $G(L/\mathbb{Q})$ is a subgroup of $S_p$ (it acts faithfully on the roots of
f(X), which are all distinct). Since p divides $|G(L/\mathbb{Q})|$, by the Cauchy theorem,
$G(L/\mathbb{Q})$ has an element of order p. This means that it is a cycle of length p. Further,
the complex conjugation is a member of $G(L/\mathbb{Q})$ which interchanges the two nonreal roots
and fixes the others, and therefore represents a transposition. Since a transposition together
with a cycle of length p generates $S_p$, it follows that the Galois group $G(L/\mathbb{Q})$ is $S_p$. Since $S_p$ is
not solvable, by the above theorem of Galois, f(X) is not solvable by radicals. In
particular, using elementary calculus and the Eisenstein irreducibility criterion, we
see that $X^5 - 4X + 2$ satisfies the conditions mentioned above, and so it is not
solvable by radicals.
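
The two facts used about $X^5 - 4X + 2$ can be checked numerically. The sketch below applies the Eisenstein test at the prime 2 and counts the real roots by sign changes between the critical points of the polynomial; the helper names are our own illustrative choices, and the floating-point evaluation is only a sanity check, not a proof.

```python
def f(x):
    return x**5 - 4*x + 2

def eisenstein(coeffs, p):
    """Eisenstein test at the prime p; coeffs run from leading to constant term."""
    lead, rest, const = coeffs[0], coeffs[1:], coeffs[-1]
    return (lead % p != 0
            and all(c % p == 0 for c in rest)
            and const % (p * p) != 0)

def sign_changes(points):
    values = [f(x) for x in points]
    return sum(1 for a, b in zip(values, values[1:]) if a * b < 0)

# f'(x) = 5x^4 - 4 vanishes only at x = +-c with c = (4/5)^(1/4), so f is
# monotone on each of the three intervals cut out by -c and c. Since
# f(-2) < 0 and f(2) > 0, there are no real roots outside [-2, 2].
c = (4 / 5) ** 0.25
real_roots = sign_changes([-2.0, -c, c, 2.0])

print(eisenstein([1, 0, 0, 0, -4, 2], 2))   # prints True
print(real_roots)                           # prints 3
```

Three real roots out of five leave exactly two nonreal roots, as the example requires.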


Since $S_n$, $n \leq 4$, is solvable, every general equation of degree n, $n \leq 4$, is solvable
by radicals. In what follows we determine formulas for these lower degree general
polynomial equations.

1. Quadratic Equations. For the general quadratic polynomial $X^2 - s_1X + s_2$, the
formula for $X_1$ and $X_2$ is the Sridharacharya formula. It is given by

$$X_i = \frac{s_1 \pm \sqrt{s_1^2 - 4s_2}}{2}.$$


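2. Cubic Equations. We give Cardano's method to solve a cubic equation
by radical operations. Consider the general third degree equation

$$X^3 - s_1X^2 + s_2X - s_3 = 0. \qquad (8.8.1)$$

Substituting $X = Y + \frac{s_1}{3}$, the equation reduces to

$$Y^3 + pY + q = 0, \qquad (8.8.2)$$

where

$$p = s_2 - \frac{s_1^2}{3} \qquad (8.8.3)$$

and

$$q = -\frac{2s_1^3}{27} + \frac{s_1s_2}{3} - s_3. \qquad (8.8.4)$$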



If $p = 0$, then the roots are $a$, $a\omega$ and $a\omega^2$, where $a = (-q)^{1/3}$, and $\omega$ is a primitive
cube root of 1. Suppose that $p \neq 0$. Substitute $Y = U + V$ in (8.8.2), and then the
equation reduces to

$$U^3 + V^3 + q + (3UV + p)(U + V) = 0. \qquad (8.8.5)$$

We set two equations

$$U^3 + V^3 + q = 0 \qquad (8.8.6)$$

and

$$3UV + p = 0 \qquad (8.8.7)$$

in U and V. Since $p \neq 0$, U and V both are nonzero. Substituting the value of V
obtained from Eq. 8.8.7 in Eq. 8.8.6, we get that

$$U^6 + qU^3 - \frac{p^3}{27} = 0. \qquad (8.8.8)$$

This is a quadratic equation in $U^3$. Solving, we get $U^3 = -\frac{q}{2} \pm \sqrt{-\frac{D}{108}}$, where
$D = -4p^3 - 27q^2$ (called the discriminant of $Y^3 + pY + q$). If we take
$U^3 = -\frac{q}{2} + \sqrt{-\frac{D}{108}}$, and denote it by P, then by symmetry $V^3 = -\frac{q}{2} - \sqrt{-\frac{D}{108}}$,
and this we denote by Q. Pairing the three cube roots of P and Q each such that their
product is $-\frac{p}{3}$, we get the roots $\sqrt[3]{P} + \sqrt[3]{Q}$, $\omega\sqrt[3]{P} + \omega^2\sqrt[3]{Q}$, $\omega^2\sqrt[3]{P} + \omega\sqrt[3]{Q}$ of
$Y^3 + pY + q = 0$. This gives all roots of the given cubic equation.
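
Cardano's method for the reduced cubic $Y^3 + pY + q = 0$ translates directly into floating-point complex arithmetic. The following is a minimal sketch; the function name `cardano` and the sample equation are our own illustrative choices. The pairing condition (product of the two cube roots equal to $-p/3$) is enforced by computing V from U through Eq. 8.8.7 rather than by taking an independent cube root.

```python
import cmath

def cardano(p, q):
    """Return the three complex roots of Y^3 + p*Y + q = 0."""
    omega = complex(-0.5, 3 ** 0.5 / 2)       # a primitive cube root of 1
    if p == 0:
        a = complex(-q) ** (1 / 3)            # one cube root of -q
        return [a, a * omega, a * omega ** 2]
    # U^3 is a root of the quadratic T^2 + q*T - p^3/27 = 0 (Eq. 8.8.8).
    P = -q / 2 + cmath.sqrt(q * q / 4 + p ** 3 / 27)
    U = P ** (1 / 3)                          # any one cube root of P
    V = -p / (3 * U)                          # forces 3UV + p = 0 (Eq. 8.8.7)
    return [U + V, omega * U + omega ** 2 * V, omega ** 2 * U + omega * V]

# Y^3 - 6Y - 9 = 0 has the root Y = 3.
for y in cardano(-6, -9):
    print(y, abs(y ** 3 - 6 * y - 9))
```

Each printed residual should be zero up to rounding error. To solve the general cubic (8.8.1), one adds $s_1/3$ to each root of the reduced equation.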

3. Biquadratic Equations. We describe Ferrari's method of solving the general
fourth degree equation

$$X^4 - s_1X^3 + s_2X^2 - s_3X + s_4 = 0. \qquad (8.8.9)$$

The above equation can also be written as

$$\left(X^2 + \frac{-s_1X + Y}{2}\right)^2 = AX^2 + BX + C,$$

where $A = \frac{s_1^2}{4} - s_2 + Y$, $B = -\frac{s_1Y}{2} + s_3$, and $C = \frac{Y^2}{4} - s_4$. We choose Y so
that the RHS becomes a perfect square. This is so if $B^2 - 4AC = 0$. This gives us
a cubic equation in Y which can be solved by Cardano's method. Suppose that
the RHS becomes $(PX + Q)^2$. Then

$$X^2 + \frac{-s_1X + Y}{2} = \pm(PX + Q).$$

At last, these quadratic equations can be solved, and we get the solutions of the given
biquadratic equation.




Exercises 


8.8.1 Solve the cubic equation $X^3 - 3X^2 + 2X + 1 = 0$.
8.8.2 Solve the equation $X^4 + 4X - 1 = 0$.


8.8.3 Find a polynomial of degree 7 over $\mathbb{Q}$ whose Galois group is $S_7$.




Chapter 9 
Representation Theory of Finite Groups 


In this chapter, we develop the elementary theory of linear representations of finite
groups over a field F. We shall assume that the characteristic of F does not divide the
order |G| of G. The representations over fields F whose characteristic divides
the order of G are called modular representations or Brauer representations, and
the theory was developed by Brauer. We shall have occasion, though rarely, to
make some comments about modular representation theory.


9.1 Semi-simple Rings and Modules 


One of the crucial properties of a field (skew field) F is that given any module M 
over F and a submodule N of M there is a submodule L of M such that M is direct 
sum of N and L. In this section, we study rings with this property. All the rings in 
this section are assumed to be rings with identities. 


Definition 9.1.1 A left R-module M is called a simple left module if it has no proper 
submodules. 


Nontrivial simple left modules over a division ring R are precisely one-dimensional 
spaces. Simple Z-modules are precisely prime cyclic groups. 


Example 9.1.2 Let D be a division ring, and $M_n(D)$ be the ring of $n \times n$ matrices with
entries in D. Let $D^n$ denote the additive group of column vectors. Then $D^n$ is a left
$M_n(D)$-module with respect to the matrix multiplication. Let X be a nonzero column
vector in $D^n$, and $Y \neq 0$ be any other nonzero column vector. It is an elementary
fact of linear algebra that there is a matrix A in $M_n(D)$ such that $A \cdot X = Y$. Thus,
$D^n$ has no proper $M_n(D)$-submodules, and so it is simple. Let $D_i$ denote the subset
of $M_n(D)$ consisting of those matrices all of whose columns except possibly the ith
column are zero. Then an easy calculation shows that $D_i$ is a minimal nonzero left


© Springer Nature Singapore Pte Ltd. 2017
R. Lal, Algebra 2, Infosys Science Foundation Series in Mathematical Sciences,
DOI 10.1007/978-981-10-4256-0_9




ideal. As a module this is isomorphic to the module $D^n$. Also observe that the subset
$\hat{D}_i$ of $M_n(D)$ consisting of matrices whose ith column is zero is also a left ideal, and
it is maximal because $M_n(D)/\hat{D}_i$ is isomorphic to the simple left module $D^n$. An
elementary calculation shows that $M_n(D)$ has no proper two-sided ideal.
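
The "elementary fact of linear algebra" used in this example can be made explicit: if the jth entry $x_j$ of X is nonzero, the matrix A whose jth column is $Y x_j^{-1}$ and whose other columns are zero satisfies $A \cdot X = Y$. The sketch below illustrates this over the field of rational numbers (a commutative special case of a division ring); the function names are our own illustrative choices.

```python
from fractions import Fraction

def matrix_sending(X, Y):
    """Return a square matrix A (list of rows) with A.X = Y, for nonzero X."""
    n = len(X)
    j = next(k for k in range(n) if X[k] != 0)   # X is assumed to be nonzero
    # Column j is Y scaled by the inverse of X[j]; all other columns are zero.
    return [[Fraction(Y[i]) / X[j] if k == j else Fraction(0) for k in range(n)]
            for i in range(n)]

def apply(A, X):
    return [sum(A[i][k] * X[k] for k in range(len(X))) for i in range(len(A))]

X = [0, 2, 5]
Y = [1, -3, 4]
A = matrix_sending(X, Y)
print(apply(A, X) == Y)   # prints True
```

Note that A has only one nonzero column, so it lies in the minimal left ideal $D_j$ described in the example.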


Proposition 9.1.3 A left R-module M is simple if and only if there is a maximal left 
ideal A of R such that M as a module is isomorphic to R/A. 


Proof Since a submodule of R/A is of the form B/A, where B is a left ideal
containing A, it follows that R/A is simple if and only if A is a maximal left ideal. Let
M be a simple left R-module. Let $x \neq 0$ be an element of M. Since M is simple,
$Rx = M$. Thus the map $a \mapsto ax$ from R to M is a surjective R-homomorphism. By the
fundamental theorem, R/A is isomorphic to M, where A is the kernel of the map. Since M
is simple, A is a maximal left ideal. ♯


Proposition 9.1.4 (Schur's Lemma) Let M and N be simple left R-modules. Then
a nonzero homomorphism from M to N is an isomorphism. In particular, $End_R(M)$
is a division ring with respect to the addition of endomorphisms, and the product as
composition of maps.


Proof Let f be a nonzero homomorphism from M to N. Then f(M) is a nonzero
submodule of N. Since N is simple, f(M) = N and so f is surjective. Next, Ker f
is also a submodule of M different from M, and since M is also simple, Ker f is {0}.
This means that f is injective. In particular, every nonzero element of $End_R(M)$ is
invertible, and so $End_R(M)$ is a division ring. ♯


Theorem 9.1.5 Let M be a left R-module. Then the following conditions are
equivalent.


1. M is sum of its simple submodules (equivalently M is generated by its simple
submodules).

2. M is direct sum of simple submodules.

3. Every submodule of M is direct summand.

4. Every short exact sequence of the type

$$0 \longrightarrow N \overset{f}{\longrightarrow} M \longrightarrow L \longrightarrow 0$$

splits.
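Proof ($1 \Rightarrow 2$). Suppose that $M = \sum_{\alpha \in \Lambda} M_\alpha$, where $\{M_\alpha \mid \alpha \in \Lambda\}$ is the family
of all simple submodules of M. Let

$$X = \{J \subseteq \Lambda \mid \sum_{\alpha \in J} M_\alpha = \bigoplus_{\alpha \in J} M_\alpha\}.$$

Then each $\{\alpha\} \in X$. Hence $X \neq \emptyset$. X with inclusion is a nonempty partially ordered
set. Let $\{J_\mu \mid \mu \in \Omega\}$ be a chain in X. Then $J = \bigcup_{\mu \in \Omega} J_\mu$ is a member of X, for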




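$\sum_{\alpha \in J} M_\alpha = \bigoplus_{\alpha \in J} M_\alpha$ (verify). Thus, J is an upper bound of the chain. By
Zorn's Lemma, X has a maximal element $J_0$ (say). We show that $M = \bigoplus_{\alpha \in J_0} M_\alpha$.
Put $N = \bigoplus_{\alpha \in J_0} M_\alpha$. Suppose that $M_\beta \not\subseteq N$ for some $\beta \in \Lambda$. Then $\beta \notin J_0$. Since
$M_\beta$ is simple, $M_\beta \cap N$ is {0}, or it is $M_\beta$. Since it is assumed that $M_\beta \not\subseteq N$, it follows
that $M_\beta \cap N = \{0\}$. Thus, $M_\beta + N = M_\beta \oplus N = \bigoplus_{\alpha \in J_0 \cup \{\beta\}} M_\alpha$. This shows
that $J_0 \cup \{\beta\} \in X$. This is a contradiction to the supposition that $J_0$ is a maximal
element of X. Hence $M_\alpha \subseteq N$ for each $\alpha \in \Lambda$. This shows that N = M.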

$2 \Rightarrow 1$ is obvious.

$2 \Rightarrow 3$. Assume 2. Let $M = \bigoplus_{\alpha \in J} M_\alpha$, where each $M_\alpha$ is a simple submodule
of M. Let N be a proper submodule of M. As in the proof of $1 \Rightarrow 2$, consider

$$X = \{F \subseteq J \mid N + \sum_{\alpha \in F} M_\alpha = N \oplus \bigoplus_{\alpha \in F} M_\alpha\}.$$

Since $N \neq M$, there is $M_\alpha$, $\alpha \in J$, such that $M_\alpha$ is not contained in N. Since $M_\alpha$ is
simple, $N \cap M_\alpha = \{0\}$, and so $N + M_\alpha = N \oplus M_\alpha$. Hence such an $\{\alpha\}$ is in X,
and so X is a nonempty set. Order it through inclusion. As in the proof of $1 \Rightarrow 2$, X
has a maximal element $F_0$ (say), and then $N \oplus \bigoplus_{\alpha \in F_0} M_\alpha = M$. This shows that N
is a direct summand.

$3 \Rightarrow 4$. Assume 3. Then f(N) is a submodule of M, and $M = f(N) \oplus K$ for
some K. Every element of M is uniquely expressible as f(n) + k, where $n \in N$ and
$k \in K$. It is clear that the map s from M to N defined by s(f(n) + k) = n defines
a splitting.

$4 \Rightarrow 3$. Assume 4. Let N be a submodule of M. Then we have a short exact
sequence

$$0 \longrightarrow N \longrightarrow M \longrightarrow M/N \longrightarrow 0,$$

where the map from N to M is the inclusion, and the map from M to M/N is the quotient
map. From 4, it is a split exact sequence, and so $M = N \oplus K$ for some submodule
K (isomorphic to M/N) of M.

$3 \Rightarrow 1$. Let $M \neq \{0\}$ be a left R-module such that every submodule of M is a direct
summand of M. We first show that every submodule N of M also has this property.
Let K be a submodule of N. Then K is also a submodule of M, and hence there is
a submodule L of M such that $M = K \oplus L$. We show that $N = K \oplus (L \cap N)$.
Clearly, $K \cap (L \cap N) = \{0\}$. Let $x \in N$. Then $x = y + z$, where $y \in K$ and
$z \in L$. Since $x, y \in N$, $z \in N$. Hence $z \in L \cap N$, and so $N = K + (L \cap N)$. Thus,
every submodule of N is a direct summand of N. Next, we show that every nonzero
submodule N of M contains a nonzero simple submodule. Let $x \in N$, $x \neq 0$. Then
$Rx \neq \{0\}$ is a submodule of N. Consider the surjective homomorphism f from R
to Rx defined by f(a) = ax. Since f is surjective, and $Rx \neq \{0\}$, it follows that
$\mathrm{Ker} f \neq R$. Suppose that A = Ker f. Then A is a proper left ideal of R. By
Krull's theorem, A can be embedded in a maximal left ideal B (say). By the first
isomorphism theorem, $R/B \cong Rx/f(B) = Rx/Bx$. Since B is maximal, R/B is
simple, and so Rx/Bx is also simple. Since Rx is a submodule of M, and Bx is a
submodule of Rx, $Rx = Bx \oplus T$ for some nonzero submodule T of Rx. T being




isomorphic to Rx/Bx is simple. Thus, N contains a nonzero simple submodule of
M. Let $M_0$ be the sum of all simple submodules of M. Suppose that $M_0 \neq M$. Then
from 3, there exists a nonzero submodule $N_0$ of M such that $M = M_0 \oplus N_0$. From
what we have proved above, $N_0$ contains a nonzero simple submodule $L_0$ of M. But
then $L_0$ is a simple submodule of M not contained in $M_0$. This is a contradiction to
the choice of $M_0$. Hence $M_0 = M$. ♯


Definition 9.1.6 A left R-module M is said to be semi-simple if it satisfies any one 
(and hence all) of the four conditions in the above theorem. 


Every vector space is semi-simple, for all subspaces are direct summands. Since 
Z-simple modules are prime cyclic groups, Z semi-simple modules are precisely 
direct sums of prime cyclic groups. More generally, simple R-modules over a P.I.D.
are isomorphic to R/Rp, where p is an irreducible element of R. In particular, a P.I.D.
R is a left semi-simple module over itself if and only if it is a field.
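
For the finite $\mathbb{Z}$-modules $\mathbb{Z}/n$ the condition "every submodule is a direct summand" can be tested exhaustively. The sketch below lists the subgroups of $\mathbb{Z}/n$ and looks for a complement to each; the function names are our own illustrative choices.

```python
def subgroups(n):
    """All subgroups of Z/n as frozensets, one for every divisor d of n."""
    return [frozenset(range(0, n, d)) for d in range(1, n + 1) if n % d == 0]

def has_complement(N, n):
    """True if some subgroup K satisfies N + K = Z/n and N intersect K = {0}."""
    for K in subgroups(n):
        # With trivial intersection, N + K has |N| * |K| elements.
        if N & K == {0} and len(N) * len(K) == n:
            return True
    return False

def is_semisimple(n):
    return all(has_complement(N, n) for N in subgroups(n))

print([n for n in range(2, 13) if is_semisimple(n)])
# prints [2, 3, 5, 6, 7, 10, 11]
```

The values that appear are exactly the square-free ones in this range, in agreement with the statement that semi-simple $\mathbb{Z}$-modules are direct sums of prime cyclic groups: for instance $\mathbb{Z}/6 \cong \mathbb{Z}/2 \oplus \mathbb{Z}/3$ is semi-simple, while in $\mathbb{Z}/4$ the subgroup {0, 2} has no complement.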


Proposition 9.1.7 Every submodule of a semi-simple left R-module is semi-simple. 
Every homomorphic image (and so quotient) module of a semi-simple left module is 
semi-simple. 


Proof Let M be a semi-simple left module, and N be a submodule of M. Then as
observed in the proof of $3 \Rightarrow 1$ (Theorem 9.1.5), it follows that every submodule
of N is a direct summand of N. Hence N is semi-simple. Let $\beta$ be a surjective
homomorphism from M to N. Since M is semi-simple, the exact sequence

$$0 \longrightarrow \mathrm{Ker}\,\beta \longrightarrow M \overset{\beta}{\longrightarrow} N \longrightarrow 0$$

splits. Hence, there exists a homomorphism t from N to M such that $\beta \circ t = I_N$.
Clearly, t is an injective homomorphism, and as a result N is isomorphic to the
submodule t(N) of M. Since a submodule of a semi-simple left module is semi-simple,
t(N), and so N, is semi-simple. ♯


Since a direct sum of direct sums of simple left modules is a direct sum of simple 
modules, we have the following proposition. 


Proposition 9.1.8 Direct sum of a family of semi-simple left modules is a
semi-simple left module. ♯


Definition 9.1.9 A ring R is said to be a Left semi-simple ring if it is a semi-simple 
left module over itself. 


A field is a semi-simple ring, for it itself is a simple module over itself. Z is not 
semi-simple, for it cannot be direct sum of simple left ideals. Subring of a left semi- 
simple ring need not be left semi-simple. For example, Q is semi-simple ring but Z 
is not. 


Theorem 9.1.10 A ring R is left semi-simple ring if and only if every left module 
over R is semi-simple. 




Proof If every left module over R is semi-simple, then in particular, R is semi-simple 
left module over itself. By the definition, R is left semi-simple. Conversely, suppose 
that R is left semi-simple ring. Then every free left R-module, being direct sum of 
copies of R, is semi-simple left module. Since every left module is quotient of a free 
left module and quotient of a semi-simple left module is semi-simple, it follows that 
every left module over R is semi-simple. ♯


The following corollary is immediate from the previous results.

Corollary 9.1.11 Let R be a ring. Then the following conditions are equivalent.


1. R is left semi-simple.

2. Every short exact sequence of left R-modules splits.

3. Every left R-module is semi-simple.

4. Every left R-module is projective.

5. Every left R-module is injective. ♯


It follows from the definition of left semi-simple ring that a ring R is left semi- 
simple if it is direct sum of minimal nonzero left ideals. Since M,,(D), where D is a 
division ring, is the direct sum of its minimal nonzero left ideals D;, it follows that 
M,,(D) is left semi-simple. 


Definition 9.1.12 A left semi-simple ring is said to be simple if it has no nonzero 
proper two-sided ideals. 


Example 9.1.13 Since M,(D) has no nonzero proper two-sided ideals, it follows 
from the above discussion that M,,(D) is a left simple ring. We shall see that every 
simple ring is isomorphic to M,,(D) for some n, and for some division ring D. 


Let $R_1$ and $R_2$ be rings. Then left ideals of $R_1 \times \{0\}$, and those of $\{0\} \times R_2$ are
also left ideals of $R_1 \times R_2$. Thus, if $R_1$ and $R_2$ are direct sums of minimal left ideals,
then $R_1 \times R_2$ is also a direct sum of minimal nonzero left ideals. This proves
the following proposition.


Proposition 9.1.14 Direct product of left semi-simple rings is a left semi-simple
ring. ♯


In particular, we have the following corollary. 


Corollary 9.1.15 Let $D_1, D_2, \ldots, D_r$ be division rings, and $n_1, n_2, \ldots, n_r$ be
positive integers. Then

$$M_{n_1}(D_1) \times M_{n_2}(D_2) \times \cdots \times M_{n_r}(D_r)$$

is a left semi-simple ring. ♯




One of the main results of this section is to show that any left semi-simple ring is 
isomorphic to such a ring. In particular, left semi-simple rings and right semi-simple 
rings are same. 

Let F be a field and G a group. Recall (see Sect. 7.6 on polynomial rings,
Algebra 1) the definition of the group ring F(G). F(G) is at first a vector space
over F with the members of G as basis. Thus, the members of F(G) are formal sums
$\sum_{g \in G} \alpha_g g$, where all but finitely many $\alpha_g$ are 0. The multiplication $\cdot$ in F(G) defined
by

$$\left(\sum_{g \in G} \alpha_g g\right) \cdot \left(\sum_{g \in G} \beta_g g\right) = \sum_{g \in G} \gamma_g g,$$

where $\gamma_g = \sum_{hk = g} \alpha_h \beta_k$, makes F(G) a ring. Thus, F(G) is an algebra over F,
and it is called a group algebra. The following result is the first basic result in the
representation theory of groups.
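
The multiplication rule $\gamma_g = \sum_{hk = g} \alpha_h \beta_k$ is a convolution over the group. The sketch below stores an element of F(G) as a dictionary from group elements to coefficients; the representation, the function names, and the choice of the cyclic group of order 3 over the rationals are our own illustrative assumptions.

```python
from fractions import Fraction

def multiply(a, b, op):
    """Product in the group algebra: gamma_g = sum over hk = g of alpha_h * beta_k.

    a and b map group elements to coefficients; op is the group operation.
    """
    result = {}
    for h, alpha in a.items():
        for k, beta in b.items():
            g = op(h, k)
            result[g] = result.get(g, 0) + alpha * beta
    # Drop zero coefficients so that equal elements compare equal.
    return {g: c for g, c in result.items() if c != 0}

# Cyclic group of order 3, written additively as {0, 1, 2}.
def add_mod3(h, k):
    return (h + k) % 3

third = Fraction(1, 3)
e = {0: third, 1: third, 2: third}      # (1/3)(1 + g + g^2)

print(multiply(e, e, add_mod3) == e)    # prints True
```

The element e is an idempotent of the group algebra. The coefficient 1/3 requires 3 to be invertible in F, which is the same hypothesis on the characteristic that appears in Maschke's theorem.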


Theorem 9.1.16 (Maschke) Let F be a field, and G be a finite group such that the 
characteristic of the field F does not divide the order of the group G. Then F(G) is 
a semi-simple ring. 


Proof Assume that the characteristic of F does not divide the order of the group G.
We shall show that every left F(G)-module is semi-simple. Let M be a left
F(G)-module and N a submodule of M. Since F is a subfield of F(G), M is a vector space
over F, and N is a subspace of M. Hence, there is an F-subspace L of M such that
M is the vector space direct sum $N \oplus L$ of N and L. Let $p_1$ be the first projection
from M to N. Then $p_1$ is an F-homomorphism from M to N such that $p_1(x) = x$
for all $x \in N$. We average $p_1$ to make it an F(G)-homomorphism. Define a map $\bar{p}_1$
from M to N by

$$\bar{p}_1(m) = (n \cdot 1)^{-1} \sum_{g \in G} g \cdot p_1(g^{-1} \cdot m),$$

where n is the order of G (note that the characteristic of F does not divide n, and so
$n \cdot 1 \neq 0$ in F). Further,

$$\bar{p}_1(m_1 + m_2) = (n \cdot 1)^{-1} \sum_{g \in G} g \cdot p_1(g^{-1}(m_1 + m_2)) = (n \cdot 1)^{-1} \sum_{g \in G} g \cdot p_1(g^{-1}m_1 + g^{-1}m_2)$$
$$= (n \cdot 1)^{-1} \sum_{g \in G} \left(g \cdot p_1(g^{-1}m_1) + g \cdot p_1(g^{-1}m_2)\right) = \bar{p}_1(m_1) + \bar{p}_1(m_2),$$

and

$$\bar{p}_1(\alpha h \cdot m) = (n \cdot 1)^{-1} \sum_{g \in G} g \cdot p_1(g^{-1} \alpha h m).$$

Putting $g^{-1}h = x^{-1}$ (so that $g = hx$) in the above equation, we get that

$$\bar{p}_1(\alpha h m) = (n \cdot 1)^{-1} \alpha h \sum_{x \in G} x \, p_1(x^{-1} m) = \alpha h \, \bar{p}_1(m)$$




for all $\alpha \in F$, $h \in G$, and $m \in M$. This shows that $\bar{p}_1$ is an F(G)-homomorphism.
Also, since N is an F(G)-submodule, for each $x \in N$, $g^{-1}x \in N$, and so $p_1(g^{-1}x) = g^{-1}x$.
Hence for each $x \in N$, we have

$$\bar{p}_1(x) = (n \cdot 1)^{-1} \sum_{g \in G} g \cdot p_1(g^{-1}x) = (n \cdot 1)^{-1} \sum_{g \in G} g g^{-1} x = (n \cdot 1)^{-1} n x = x.$$

Thus, $\bar{p}_1 \circ i = I_N$, where i is the inclusion map from N to M. Hence, M is the
F(G)-direct sum of N and $\mathrm{Ker}\,\bar{p}_1$. This completes the proof of the fact that M is
F(G)-semi-simple. ♯
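
The averaging step in the proof can be followed on a very small example. In the sketch below, the group of order 2 acts on $\mathbb{Q}^2$ by swapping coordinates, N is the line spanned by (1, 1), and $p_1$ is the projection onto N along the line spanned by (1, 0), which is linear but does not commute with the group action. The group, the module, and all names are our own illustrative choices.

```python
from fractions import Fraction

# G = {e, s} acts on Q^2, with s swapping the two coordinates.
GROUP = ["e", "s"]
INVERSE = {"e": "e", "s": "s"}

def act(g, v):
    return v if g == "e" else (v[1], v[0])

def p1(v):
    """Projection onto N = span{(1, 1)} along L = span{(1, 0)}; F-linear only."""
    # (a, b) = b(1, 1) + (a - b)(1, 0), so the N-component is (b, b).
    return (v[1], v[1])

def averaged(v):
    """(|G| . 1)^(-1) times the sum over g of g . p1(g^(-1) . v)."""
    total = (Fraction(0), Fraction(0))
    for g in GROUP:
        w = act(g, p1(act(INVERSE[g], v)))
        total = (total[0] + w[0], total[1] + w[1])
    return (total[0] / len(GROUP), total[1] / len(GROUP))

v = (Fraction(3), Fraction(5))
print(act("s", p1(v)) == p1(act("s", v)))               # prints False
print(act("s", averaged(v)) == averaged(act("s", v)))   # prints True
print(averaged(v))   # the vector (4, 4), printed as a pair of Fractions
```

Here the averaged map sends (a, b) to ((a + b)/2, (a + b)/2). Its kernel is the line spanned by (1, -1), which is invariant under the swap, so it is the complement of N as a module over the group algebra.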


Our next aim is to determine the structure of a semi-simple ring.
Let

$$M = M_1 \oplus M_2 \oplus \cdots \oplus M_n$$

and

$$N = N_1 \oplus N_2 \oplus \cdots \oplus N_m$$

be some direct sum decompositions of left R-modules M and N. Let $M_{MN}$ denote
the set of matrices $[\phi_{sr}]$, where $\phi_{sr} \in Hom_R(M_r, N_s)$. Let $f_{sr}$ denote the
homomorphism $p_s \circ f \circ i_r$ from $M_r$ to $N_s$, where $i_r$ is the natural inclusion of $M_r$ in M,
and $p_s$ the sth projection of N to $N_s$. It is easy to verify that the map $\eta_{MN}$ from
$Hom_R(M, N)$ to $M_{MN}$ defined by

$$\eta_{MN}(f) = [f_{sr}]$$

is a bijective map. In fact every $x \in M$ is uniquely expressed as $x_1 + x_2 + \cdots + x_n$,
where $x_r \in M_r$, and then $f(x) = \sum_{s=1}^{m} \sum_{r=1}^{n} f_{sr}(x_r)$. In matrix form it is expressed by

$$f(x) = \begin{pmatrix} f_{11} & f_{12} & \cdots & f_{1n} \\ \vdots & \vdots & & \vdots \\ f_{m1} & f_{m2} & \cdots & f_{mn} \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.$$

Further, let $M_{NL}$ denote the set of matrices with respect to the above given direct
decomposition of N, and with respect to a direct decomposition of L. Then the
matrix multiplication induces an external multiplication from $M_{NL} \times M_{MN}$ to $M_{ML}$
(multiplication of entries is composition of maps). It is also easy to observe that
$\eta_{ML}(g \circ f) = \eta_{NL}(g) \cdot \eta_{MN}(f)$. In particular, if L is a left R-module with a given
direct sum decomposition, and $M_{LL}$ denotes the corresponding set of matrices, then
$M_{LL}$ is a ring with respect to matrix addition and multiplication. Clearly, this ring is
isomorphic to the ring $End_R(L)$.
Let us denote by $M^n$ the n times direct sum of M.




Proposition 9.1.17 Let $M = M_1^{n_1} \oplus M_2^{n_2} \oplus \cdots \oplus M_r^{n_r}$, where $\{M_1, M_2, \ldots, M_r\}$
is a set of pairwise nonisomorphic simple left R-modules. Then the ring $End_R(M)$
is isomorphic to the direct product

$$M_{n_1}(D_1) \times M_{n_2}(D_2) \times \cdots \times M_{n_r}(D_r),$$

where $D_i = End_R(M_i)$ is a division ring, and $M_{n_i}(D_i)$ is the ring of $n_i \times n_i$
matrices with entries in $D_i$.


Proof Since each $M_i$ is a simple left R-module, by Schur's Lemma $End_R(M_i)$ is a
division ring, and for $i \neq j$, $Hom_R(M_i, M_j) = \{0\}$. It follows from the discussion
prior to the proposition that $End_R(M)$ is isomorphic to the ring of matrices of the
type

$$\begin{pmatrix} A_1 & 0 & \cdots & 0 \\ 0 & A_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & A_r \end{pmatrix},$$

where $A_i \in M_{n_i}(D_i)$. The map which associates each matrix of the above type to
$(A_1, A_2, \ldots, A_r)$ defines an isomorphism from $End_R(M)$ to
$M_{n_1}(D_1) \times M_{n_2}(D_2) \times \cdots \times M_{n_r}(D_r)$. ♯


Proposition 9.1.18 Let R be a ring with identity. Let $a \in R$. Then the map $f_a$ from
R to R defined by $f_a(x) = x \cdot a$ is a member of $End_R(R, +)$, where (R, +) is
treated as a left R-module. Further, the map f from R to $End_R(R, +)$ defined by
$f(a) = f_a$ is an anti-isomorphism.


Proof $f_a(x + y) = (x + y) \cdot a = x \cdot a + y \cdot a = f_a(x) + f_a(y)$. Also
$f_a(b \cdot x) = b \cdot x \cdot a = b \cdot f_a(x)$. Thus, $f_a \in End_R(R, +)$. Next, $f(a + b)(x) =
f_{a+b}(x) = x \cdot (a + b) = x \cdot a + x \cdot b = f_a(x) + f_b(x) = f(a)(x) + f(b)(x) =
(f(a) + f(b))(x)$ for all $a, b, x \in R$. Thus, $f(a + b) = f(a) + f(b)$. Next, $f_{ab}(x) =
x \cdot ab = f_b(f_a(x))$ for all $a, b, x \in R$. This shows that $f(ab) = f(b)f(a)$ for
all $a, b \in R$, and so f is an anti-homomorphism. Suppose that $f(a) = f(b)$. Then
$a = 1 \cdot a = f_a(1) = f(a)(1) = f(b)(1) = f_b(1) = 1 \cdot b = b$. Thus, f is an
injective anti-homomorphism. Further, given any $\phi \in End_R(R, +)$, we have
$\phi(x) = \phi(x \cdot 1) = x \cdot \phi(1)$ for all $x \in R$, and so $\phi = f(\phi(1))$.
This shows that f is also surjective. ♯
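
The order reversal $f(ab) = f(b)f(a)$ is visible already for 2 by 2 integer matrices, where multiplication is not commutative. In the sketch below, matrices are tuples of rows and `right_mult(a)` plays the role of $f_a$; the names and the sample matrices are our own illustrative choices.

```python
def matmul(x, y):
    """Product of two 2x2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def right_mult(a):
    """The map f_a sending x to x . a."""
    return lambda x: matmul(x, a)

a = ((1, 2), (0, 1))
b = ((0, 1), (1, 1))
x = ((2, 3), (5, 7))

lhs = right_mult(matmul(a, b))(x)          # f_{ab}(x) = x(ab)
rhs = right_mult(b)(right_mult(a)(x))      # f_b(f_a(x)) = (xa)b
other = right_mult(a)(right_mult(b)(x))    # f_a(f_b(x)) = (xb)a

print(lhs == rhs)     # prints True
print(lhs == other)   # prints False
```

The first comparison is associativity, which gives $f_{ab} = f_b \circ f_a$; the second shows that the composition in the other order gives a different map for these matrices.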


Theorem 9.1.19 Let R be a left semi-simple ring. Then there are only finitely many
nonisomorphic simple left modules, each isomorphic to a simple left ideal of R. Let
$\{M_1, M_2, \ldots, M_r\}$ be a set of pairwise nonisomorphic simple left R-modules such
that each simple left R-module is isomorphic to $M_i$ for some i. Let $D_i$ denote the




division ring $End_R(M_i)$. Then there exist positive integers $n_1, n_2, \ldots, n_r$ such that
the ring R is isomorphic to the ring

$$M_{n_1}(D_1) \times M_{n_2}(D_2) \times \cdots \times M_{n_r}(D_r).$$


Proof Let R be a left semi-simple ring. Let M be a nontrivial simple left R-module.
Then there is a maximal left ideal A of R such that R/A is isomorphic to M. Since
R is left semi-simple, there is a left ideal B of R such that $R = A \oplus B$. But
then B, being isomorphic to R/A, is a simple (minimal nontrivial) left ideal, and it
is isomorphic to M. Since R is a left semi-simple ring, it is a direct sum of simple left
ideals. Suppose that

$$R = \bigoplus_{\alpha \in \Lambda} A_\alpha,$$

where $A_\alpha$ is a simple left ideal for each $\alpha \in \Lambda$. Thus, $1 \in R$ can be uniquely
expressed as

$$1 = e_{\alpha_1} + e_{\alpha_2} + \cdots + e_{\alpha_t},$$

where $\alpha_i \in \Lambda$ and each $e_{\alpha_i} \neq 0$ is a member of $A_{\alpha_i}$. Since each $A_{\alpha_i}$ is a simple left
ideal, $Re_{\alpha_i} = A_{\alpha_i}$. Thus,

$$R = R \cdot 1 = Re_{\alpha_1} + Re_{\alpha_2} + \cdots + Re_{\alpha_t} = A_{\alpha_1} \oplus A_{\alpha_2} \oplus \cdots \oplus A_{\alpha_t}.$$

In turn, R is a direct sum of finitely many simple left ideals. Suppose that

$$R = A_1 \oplus A_2 \oplus \cdots \oplus A_t,$$

where each $A_i$ is a simple left ideal of R. We show that any nontrivial simple left module
is isomorphic to $A_i$ for some i. Let M be a nonzero simple left R-module. Then
$\{0\} \neq M = RM$. Hence $A_iM \neq \{0\}$ for some i. Since $A_iM$ is a nonzero submodule
of M and M is simple, it follows that $A_iM = M$. In turn, it follows that $A_ix \neq \{0\}$
for some $x \in M$. Since $A_ix$ is also a submodule of M, we have $A_ix = M$.
Define a map f from $A_i$ to M by $f(a) = a \cdot x$. Then f is clearly a surjective
R-homomorphism. Since $A_i$ is simple and $\mathrm{Ker} f \neq A_i$, it follows that Ker f = {0}.
Thus, f is an isomorphism from $A_i$ to M. This shows that every simple left R-module
is isomorphic to $A_i$ for some i, and so there are only finitely many nonisomorphic
simple left R-modules. Let $\{M_1, M_2, \ldots, M_r\}$ be a set of pairwise nonisomorphic
simple left R-modules such that every simple left R-module is isomorphic to $M_i$
for some i. Suppose that $n_i$ of $A_1, A_2, \ldots, A_t$ are isomorphic to $M_i$. Then as a left
R-module, R is isomorphic to

$$M_1^{n_1} \oplus M_2^{n_2} \oplus \cdots \oplus M_r^{n_r}.$$


Thus, $\mathrm{End}_R(R, +)$ is isomorphic to

$$M_{n_1}(D_1) \times M_{n_2}(D_2) \times \cdots \times M_{n_r}(D_r).$$

The map $A \mapsto A^t$ defines an anti-isomorphism from $M_{n_i}(D_i)$ to itself. Thus, the
above ring is anti-isomorphic to itself. Further, we have seen that $R$ is anti-isomorphic
to $\mathrm{End}_R(R, +)$. Since the composition of two anti-isomorphisms is an isomorphism, the
result follows. $\sharp$


Corollary 9.1.20 Every left semi-simple ring is anti-isomorphic to itself. $\sharp$


It is clear that $R$ is left semi-simple if and only if the opposite ring of $R$ is right
semi-simple. Thus, we have the following corollary.


Corollary 9.1.21 Left semi-simple rings and right semi-simple rings are the same. $\sharp$


From now onward, we shall simply write semi-simple ring instead of left semi-simple
or right semi-simple.


Corollary 9.1.22 A semi-simple ring is commutative if and only if it is a direct product
of fields.


Proof The result follows if we observe that $M_n(D)$ is commutative if and only if $D$
is a field and $n = 1$. $\sharp$


Let $M$ be a left $R$-module, and $S = \mathrm{End}_R(M)$. Then $M$ is also a left $S$-module
with respect to $\cdot$ defined by $f \cdot x = f(x)$.


Theorem 9.1.23 (Jacobson Density Theorem) Let $M$ be a simple left $R$-module
and $S = \mathrm{End}_R(M)$. Let $f \in \mathrm{End}_S(M)$, and $x_1, x_2, \ldots, x_r \in M$. Then there exists
$a \in R$ such that $f(x_i) = a x_i$ for all $i$.


Proof We first prove the result for $r = 1$, and for a semi-simple left $R$-module
$M$. Assume that $M$ is semi-simple. Then there is a submodule $N$ of $M$ such that
$M = R x_1 \oplus N$. The first projection $p_1$ is a member of $S = \mathrm{End}_R(M)$. Hence $f(x_1) =
f(p_1(x_1)) = f(p_1 \cdot x_1) = p_1 \cdot f(x_1) = p_1(f(x_1))$. Thus, $f(x_1) \in R x_1$, and so $f(x_1) = a x_1$
for some $a \in R$. Next, assuming that $M$ is simple, we prove the result for arbitrary
$r$. Consider the map $f^r$ from $M^r$ to $M^r$ defined by

$$f^r(u_1, u_2, \ldots, u_r) = (f(u_1), f(u_2), \ldots, f(u_r)).$$

Clearly, $S' = \mathrm{End}_R(M^r) = M_r(S)$ is the ring of $r \times r$ matrices with entries in $S$.
Now, $f^r$ preserves addition, and

$$f^r(h \cdot (u_1, u_2, \ldots, u_r)) = f^r\Big(\sum_{i=1}^r h_{i1} \cdot u_i, \sum_{i=1}^r h_{i2} \cdot u_i, \ldots, \sum_{i=1}^r h_{ir} \cdot u_i\Big),$$

where $h = [h_{ij}] \in S'$. Applying the definition of $f^r$, and observing that $f \in
\mathrm{End}_S(M)$, it follows that the above is the same as $h \cdot f^r(u_1, u_2, \ldots, u_r)$. This shows
that $f^r \in \mathrm{End}_{S'}(M^r)$. Since $M^r$ is semi-simple, for $(x_1, x_2, \ldots, x_r) \in M^r$, there
is an $a \in R$ such that $f^r(x_1, x_2, \ldots, x_r) = a \cdot (x_1, x_2, \ldots, x_r)$. This shows that
$f(x_i) = a x_i$ for all $i$. $\sharp$


Remark 9.1.24 The term density theorem for the above result is justified in the
following sense. Give $M$ the discrete topology. The set $M^M$ of all maps from $M$ to
$M$ can be considered as the product of $M$ copies of $M$. Give the product topology to
$M^M$. $\mathrm{End}_S(M)$ is a subset of $M^M$. Give the subspace topology to $\mathrm{End}_S(M)$. Let
$a \in R$. The map $f_a$ from $M$ to $M$ defined by $f_a(x) = a \cdot x$ is easily seen to be a
member of $\mathrm{End}_S(M)$, and the map $f$ from $R$ to $\mathrm{End}_S(M)$ defined by $f(a) = f_a$
is a ring homomorphism. The Jacobson density theorem can be restated by saying
that $f(R)$ is dense in $\mathrm{End}_S(M)$ (justify).


Corollary 9.1.25 (Burnside) Let $V$ be a finite-dimensional vector space over an
algebraically closed field $F$. Then $V$ is a simple $\mathrm{End}_F(V)$-module, and it cannot be
simple over any proper sub-algebra of $\mathrm{End}_F(V)$.


Proof Let $v$ and $w$ be nonzero members of $V$. Then there is a member $T$ of $\mathrm{End}_F(V)$
such that $T(v) = w$. This says that $V$ is a simple left $\mathrm{End}_F(V)$-module. Now, let
$R$ be a sub-algebra of $\mathrm{End}_F(V)$ such that $V$ is left simple over $R$. We show that
$R = \mathrm{End}_F(V)$. Since $V$ is a simple left $R$-module, $\mathrm{End}_R(V)$ is a division ring. Let
$a \in F$. The map $f_a$ from $V$ to $V$ defined by $f_a(v) = a \cdot v$ is a member of $\mathrm{End}_R(V)$,
for $f_a(g \cdot v) = a \cdot g(v) = g(a \cdot v) = g \cdot f_a(v)$ for all $g \in \mathrm{End}_F(V)$, and in particular
for all $g \in R$. The map $f$ from $F$ to $\mathrm{End}_R(V)$ defined by $f(a) = f_a$ is an embedding of
$F$ into $\mathrm{End}_R(V)$, for $f_a = f_b$ implies that $a \cdot v = b \cdot v$ for all $v \in V$, and this, in turn,
implies that $a = b$. Also if $h \in \mathrm{End}_R(V)$, then $(f_a \circ h)(v) = a \cdot h(v) = h(a \cdot v) =
(h \circ f_a)(v)$ for all $a \in F$. This shows that $F$ is embedded as a subfield of $\mathrm{End}_R(V)$
contained in the center. We identify the embedded subfield with $F$. More precisely, we
identify $a$ and $f_a$. Let $h \in \mathrm{End}_R(V)$. Let $F(h)$ denote the subfield of $\mathrm{End}_R(V)$ generated
by $F$ and $h$ (note that $h$ commutes with each element of $F$). Further, observe that
$\mathrm{End}_R(V)$ is an $F$-subspace of $\mathrm{End}_F(V)$, which is finite dimensional. Hence, $F(h)$ is also a
finite-dimensional subspace over $F$. Thus, $F(h)$ is a finite field extension of $F$. Since $F$
is algebraically closed, it follows that $h \in F$. This shows that $F = \mathrm{End}_R(V)$. Let
$\{v_1, v_2, \ldots, v_n\}$ be a basis of $V$ over $F$. Let $T \in \mathrm{End}_F(V) = \mathrm{End}_{\mathrm{End}_R(V)}(V)$. By
the Jacobson density theorem, there is an $h \in R$ such that $T(v_i) = h(v_i)$ for all $i$,
and so $T = h$. This shows that $R = \mathrm{End}_F(V)$. $\sharp$


Let $F$ be a field, and $G$ be a finite group. Let $V$ be a finite-dimensional vector
space over $F$ which is also an $F(G)$-module. Then the map $f$ from $F(G)$ to $\mathrm{End}_F(V)$
defined by $f(\sum_{g \in G} a_g g)(v) = (\sum_{g \in G} a_g g) \cdot v$ is an algebra homomorphism
(it need not be injective in general). Thus, the image of $F(G)$ is a sub-algebra of
$\mathrm{End}_F(V)$. The following corollary is a restatement of the above corollary in this situation.


Corollary 9.1.26 Let $F$ be an algebraically closed field, and $G$ be a finite group.
Let $V$ be a simple left $F(G)$-module which as an $F$-space is of dimension $n$. Then the set
$\{f_g \mid g \in G\}$ generates $\mathrm{End}_F(V)$ as an $F$-space, where $f_g$ is the linear transformation
given by $f_g(v) = g \cdot v$. In particular, $G$ contains at least $n^2$ elements, and we have
a subset $S$ of $G$ containing $n^2$ elements such that the set $\{f_g \mid g \in S\}$ is a basis of
$\mathrm{End}_F(V)$. $\sharp$
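The spanning statement of Corollary 9.1.26 can be tested numerically in a small case. The sketch below assumes a particular pair of integer matrices `r` and `s` generating a copy of $S_3$ inside $GL(2, \mathbb{Q})$ (a standard choice, not taken from the text), closes them under multiplication, and computes the dimension of the span of the group elements inside $M_2$ by exact Gaussian elimination. The rank is computed over $\mathbb{Q}$, which equals the rank over any extension field of $\mathbb{Q}$.

```python
from fractions import Fraction

def matmul(x, y):
    """Product of two square matrices stored as nested tuples."""
    n = len(x)
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )

def generate(gens):
    """Close the generators under multiplication (a finite group is assumed)."""
    identity = ((1, 0), (0, 1))
    elems = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = matmul(g, s)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return elems

def rank(rows):
    """Rank of a list of rational row vectors via exact elimination."""
    rows = [[Fraction(v) for v in row] for row in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [u - factor * v for u, v in zip(rows[i], rows[rk])]
        rk += 1
    return rk

# Assumed generators: r has order 3, s has order 2, and s r s = r^(-1).
r = ((0, -1), (1, -1))
s = ((0, 1), (1, 0))
G = generate([r, s])

# Flatten each 2x2 matrix to a vector in F^4 and compute the span dimension.
span_dim = rank([[m[0][0], m[0][1], m[1][0], m[1][1]] for m in G])
```

Here `G` has 6 elements and `span_dim` equals $4 = n^2$ with $n = 2$, in line with the corollary.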


Let $F$ be a field, and $G$ be a subgroup of the general linear group $GL(n, F)$. Then
every element $\sum_{A \in G} a_A A$ of $F(G)$ can also be viewed as a member of $M_n(F) =
\mathrm{End}_F(F^n)$. In other words, we have an algebra homomorphism from $F(G)$ to $M_n(F)$.
This makes $F^n$ a left $F(G)$-module. We say that $G$ is an irreducible subgroup if the
$F(G)$-module described above is simple. This amounts to saying that given $v \neq 0$ in $F^n$,
and any $w \in F^n$, there is an element $\sum_{A \in G} a_A A$ in $F(G)$ such that $\sum_{A \in G} a_A A \cdot v = w$,
or equivalently, the subspace generated by $\{A \cdot v \mid A \in G\}$ is $F^n$. Another way to
express this is to say that $F^n$ has no nontrivial proper $G$-invariant subspace.


Corollary 9.1.27 Let $F$ be an algebraically closed field, and $G$ be an irreducible
subgroup of $GL(n, F)$. Suppose further that the set $\{\mathrm{Tr}(A) \mid A \in G\}$ is finite, and
it contains $m$ elements. Then $G$ is finite and contains at most $m^{n^2}$ elements.


Proof It follows from the above corollary that the $F$-subspace of $M_n(F)$ generated
by $G$ is $M_n(F)$. We can therefore choose a basis of $M_n(F)$ out of elements of $G$.
Let $\{A^1, A^2, \ldots, A^{n^2}\}$ be a basis of $M_n(F)$, where $A^p = [a^p_{ij}] \in G$ for all $p \leq n^2$.
Let $A = [a_{ij}]$ be an arbitrary element of $G$. Let us denote the trace of $A^p A$ by $\alpha^p$.
Thus, the entries of $A$ satisfy the system of $n^2$ linear equations

$$\sum_{i,j=1}^{n} a^p_{ij} x_{ji} = \alpha^p, \quad 1 \leq p \leq n^2,$$

in unknowns $x_{ij}$. Since $\{A^1, A^2, \ldots, A^{n^2}\}$ is a basis of $M_n(F)$, the above system of
linear equations has a unique solution. Hence, $A$ is uniquely determined by the traces
$\alpha^p$ of $A^p A$. Since the number of traces of elements of $G$ is at most $m$, we have at
most $m$ choices for $\alpha^p$ for each $p$. The choices for $A$, therefore, are at most $m^{n^2}$.
This shows that $G$ contains at most $m^{n^2}$ elements. $\sharp$


Corollary 9.1.28 Let $G$ be a subgroup of $GL(n, F)$ with finitely many conjugacy
classes, where $F$ is an arbitrary field. Then $G$ is finite.


Proof Since $GL(n, F)$ is a subgroup of $GL(n, \overline{F})$, where $\overline{F}$ is the algebraic closure of $F$,
we may assume that $F$ is algebraically closed. We may assume that $G$ is irreducible.
Since $G$ has only finitely many conjugacy classes, and conjugate elements have the same
trace, it follows that there are only finitely many traces of elements of $G$. The result
follows from the above corollary. $\sharp$


We give a few applications of the above result to linear groups.




Theorem 9.1.29 (Burnside) Let $F$ be a field of characteristic 0. Then all subgroups
of $GL(n, F)$ of finite exponent are finite. Indeed, if $G$ is of exponent $m$, then it
contains at most $m^{n^3}$ elements.


Proof Clearly, $GL(n, F)$ is a subgroup of $GL(n, \overline{F})$, where $\overline{F}$ is the algebraic closure
of $F$. Thus, there is no loss in assuming that $F$ is algebraically closed. The proof
is by induction on $n$. If $n = 1$, then $G$ is a subgroup of $F^*$ of finite exponent $m$.
Since the number of solutions of the equation $X^m = 1$ in $F$ is at most $m$, the order of a
subgroup of $F^*$ of exponent $m$ is at most $m = m^{1^3}$. Assume that the result is true
for subgroups of $GL(r, F)$, where $r < n$. Then we prove the result for subgroups
of $GL(n, F)$. Let $G$ be a subgroup of $GL(n, F)$ of finite exponent $m$, where $F$ is
algebraically closed. Suppose that $G$ is irreducible. Since $A^m = I$ for all $A \in G$,
all eigenvalues of elements of $G$ are $m$th roots of 1. Since the trace of a matrix is the
sum of its $n$ eigenvalues, there are at most $m^n$ traces of the members of $G$. It follows
from the above corollary that $G$ is of order at most $(m^n)^{n^2} = m^{n^3}$. Now, suppose
that $G$ is reducible. Then there is a nontrivial proper subspace $W$ of $F^n$ which is
invariant under multiplication by elements of $G$. Clearly, $\mathrm{Dim}\, W = s < n$,
and $\mathrm{Dim}\, F^n/W = t < n$. We have a homomorphism $\rho$ from $G$ to $GL(W)$ defined
by $\rho(A) = A|_W$, where $A|_W$ is the restriction of the matrix multiplication on $F^n$
to $W$. Then $\rho(G)$ is a subgroup of $GL(W)$ of exponent at most $m$. By the induction
hypothesis, $\rho(G)$ is finite of order at most $m^{s^3}$. Let $H_1$ be the kernel of $\rho$. Then by
the fundamental theorem of homomorphism, $H_1$ is a normal subgroup of $G$ of index
at most $m^{s^3}$. Clearly, $H_1 = \{A \in G \mid A \cdot w = w \text{ for all } w \in W\}$. Next, since $W$
is invariant under multiplication by the members of $G$, we have a homomorphism $\eta$
from $G$ to $GL(F^n/W)$ defined by $\eta(A)(v + W) = A \cdot v + W$. By the induction
hypothesis, $\eta(G)$, being of exponent at most $m$, is finite of order at most $m^{t^3}$. Let $H_2$
be the kernel of $\eta$. Then $H_2$ is also a normal subgroup of $G$ of index at most $m^{t^3}$. Let
$H = H_1 \cap H_2$. Then $H$ is also normal of index at most $m^{s^3} \cdot m^{t^3} = m^{s^3 + t^3}$. Since
$s + t = n$ and $s^3 + t^3 \leq (s + t)^3$, it follows that $G/H$ is of order at most $m^{n^3}$. Also
$H$ acts trivially on $W$ as well as on $F^n/W$. Hence, we can find a basis of $F^n$ with
respect to which the matrices of all elements of $H$ are upper triangular with all
diagonal entries equal to 1. Since $F$ is of characteristic 0, all nonidentity unitriangular
matrices are of infinite order, and so $H$ is trivial. This shows that $G$ is of order at
most $m^{n^3}$. $\sharp$
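Two elementary facts used at the end of the above proof can be illustrated by a short computation: powers of a nonidentity unitriangular matrix over a field of characteristic 0 never return to the identity, and $s^3 + t^3 \leq (s + t)^3$ for positive integers. The sketch below assumes the sample matrix `u` and integer entries (standing in for characteristic 0); the ranges checked are arbitrary illustrative choices, so it is a sanity check rather than a proof.

```python
def matmul(x, y):
    """Product of two square integer matrices stored as nested tuples."""
    n = len(x)
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )

def power(u, k):
    """Compute u^k by repeated multiplication, for k >= 0."""
    n = len(u)
    result = tuple(tuple(1 if i == j else 0 for j in range(n)) for i in range(n))
    for _ in range(k):
        result = matmul(result, u)
    return result

identity = ((1, 0), (0, 1))
u = ((1, 1), (0, 1))  # a nonidentity unitriangular matrix

# u^k = [[1, k], [0, 1]], which differs from the identity for every k >= 1.
powers = [power(u, k) for k in range(1, 51)]

# The exponent estimate used to bound the index of H.
cube_bound_holds = all(
    s**3 + t**3 <= (s + t)**3 for s in range(1, 30) for t in range(1, 30)
)
```

Over a field of characteristic $p > 0$ the same matrix `u` has order $p$, which is why the hypothesis on the characteristic is needed.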


The group $GL(n, F)$ can be thought of as a subgroup of $GL(n + 1, F)$ by identifying
an $n \times n$ matrix $A$ with

$$\begin{pmatrix} A & 0_{n \times 1} \\ 0_{1 \times n} & 1 \end{pmatrix}.$$

The union of the chain

$$GL(1, F) \subset GL(2, F) \subset \cdots \subset GL(n, F) \subset GL(n + 1, F) \subset \cdots$$

is a group denoted by $GL(F)$. A subgroup of $GL(F)$ is called a linear group. It is clear
that every finitely generated linear group is a subgroup of $GL(n, F)$ for sufficiently
large $n$. Thus, we have the following corollary.


Corollary 9.1.30 A finitely generated linear group over a field of characteristic
0 is finite if and only if it is of finite exponent. $\sharp$


Burnside conjectured that all finitely generated groups of finite exponent are finite.
This conjecture turns out to be false. In fact, we have uncountably many 2-generator
infinite simple groups all of whose nontrivial proper subgroups are cyclic groups
of the same prime order $p$ ($p$ sufficiently large). As such, another modified conjecture
known as the restricted Burnside conjecture was framed. The restricted Burnside conjecture
asserts that for all $n$ and $r$, there is a finite group $RB(n, r)$ of exponent $r$ which
is generated by $n$ elements such that every $n$-generator finite group of exponent $r$ is
a quotient of $RB(n, r)$. This conjecture was finally settled by Zelmanov, who was awarded
the Fields Medal for this work in 1994.


Theorem 9.1.31 (Schur) Every torsion subgroup of $GL(n, \mathbb{Q})$ is finite. In fact, there
exists a function $f$ from $\mathbb{N}$ to $\mathbb{N}$ such that the order of every torsion subgroup of $GL(n, \mathbb{Q})$
is less than or equal to $f(n)$.


Proof It is sufficient to show the existence of a function $g$ such that the order of each
finite-order element of $GL(n, \mathbb{Q})$ is at most $g(n)$, for then, using the above result of
Burnside, the order of each torsion subgroup of $GL(n, \mathbb{Q})$ is at most $f(n)$, where
$f(n) = g(n)^{n^3}$. We show that if $m$ is the order of an element of $GL(n, \mathbb{Q})$, then
$\phi(m) \leq n$, where $\phi$ is Euler's phi function. This, in turn, implies that there is a
function $g$ on $\mathbb{N}$ such that $m$ is bounded by $g(n)$. The proof of the assertion
is by induction on $n$. If $n = 1$, then $GL(1, \mathbb{Q}) = \mathbb{Q}^*$. The only elements of $\mathbb{Q}^*$
of finite order are $1$ and $-1$. Since $\phi(1) = \phi(2) = 1$, the result follows for $n = 1$.
Assume the result for all $GL(r, \mathbb{Q})$, where $r < n$. Consider the subgroup $G = \langle x \rangle$
of $GL(n, \mathbb{Q})$, where $x$ is an element of order $m$. Suppose that $G$ is irreducible. Then
$\mathbb{Q}^n$ is a simple $\mathbb{Q}(G)$-module. Thus, $\mathrm{End}_{\mathbb{Q}(G)}(\mathbb{Q}^n) = D$ is a division algebra
over $\mathbb{Q}$. Clearly, the center $F$ of $D$ is a subfield containing $\mathbb{Q}$ and $x$. Since $x$ is
of order $m$, it is a root of the cyclotomic polynomial $\Phi_m(X)$ over $\mathbb{Q}$, and $\Phi_m(X)$ is
irreducible of degree $\phi(m)$ over $\mathbb{Q}$. Since $x \notin \mathbb{Q}$, it is the minimal polynomial of $x$.
Thus, there exists $v \in \mathbb{Q}^n$, $v \neq 0$, such that $\{v, xv, x^2 v, \ldots, x^{\phi(m)-1} v\}$ is linearly
independent; otherwise, $x$ will be a root of a polynomial of lower degree. This means
that $\phi(m) \leq n$. This completes the proof of the theorem. $\sharp$


A matrix $A$ in $M_n(F)$ is called unipotent if all its characteristic roots are 1. Clearly,
unipotent matrices are nonsingular.


Proposition 9.1.32 Let $F$ be an algebraically closed field. Every subgroup $G$ of
$GL(n, F)$ consisting of unipotent matrices is conjugate to a subgroup of the subgroup
$U(n, F)$ of uni-upper triangular matrices.


Proof To say that $G$ is conjugate to a subgroup of $U(n, F)$ is to say that there is a
basis of $F^n$ such that the matrix representations of the linear transformations from $F^n$ to
$F^n$ determined by the multiplications by elements of $G$ are uni-upper triangular. The
proof is by induction on $n$. If $n = 1$, then there is nothing to do. Assume that the
result is true for all subgroups of unipotent transformations in $GL(r, F)$ for $r < n$.
We prove it for a subgroup $G$ of $GL(n, F)$. Suppose that $G$ is irreducible. Then,
since each member of $G$ is unipotent, the trace of each member of $G$ is $n$. Hence, $G$
contains at most $1^{n^2} = 1$ element. But then $G$ is the trivial group. Assume that $G$
is not irreducible. Then there is a nontrivial proper subspace $W$ of $F^n$ such that $W$ is
invariant under multiplication by elements of $G$. This gives a homomorphism $\rho$ from $G$
to $GL(W)$ defined by $\rho(A) = A|_W$, where $A|_W$ is the restriction of the multiplication
by $A$ to $W$. Clearly, $\rho(G)$ consists of unipotent transformations in $GL(W)$. By
the induction assumption, we can find a basis $\{w_1, w_2, \ldots, w_s\}$ of $W$
such that the matrix representation of elements of $\rho(G)$ with respect to this basis
is uni-upper triangular. Also, elements of $G$ induce unipotent transformations on
$F^n/W$. This gives us a homomorphism $\eta$ from $G$ to $GL(F^n/W)$ such that $\eta(G)$
is a group of unipotent transformations. By the induction hypothesis, we can find
a basis $\{v_1 + W, v_2 + W, \ldots, v_t + W\}$ of $F^n/W$ so that the matrix representation
of each member of $\eta(G)$ with respect to this basis is uni-upper triangular. Clearly,
$\{w_1, w_2, \ldots, w_s, v_1, v_2, \ldots, v_t\}$ is a basis of $F^n$ with respect to which all members
of $G$ are uni-upper triangular. $\sharp$
Exercises 


9.1.1 Describe all semi-simple modules over a P.I.D.

9.1.2 Describe all simple modules over $\mathbb{R}[X]$.

9.1.3 What are the integral domains which are semi-simple rings?

9.1.4 Let $G$ be a cyclic group of order $p$. Show that $\mathbb{Z}_p(G)$ is not semi-simple.
Hint. $\mathbb{Z}_p^2$ is a $\mathbb{Z}_p$-vector space. It is a $G$-module with respect to the multiplication
defined by $a^i \cdot (u, v) = (u + iv, v)$. Show that $\mathbb{Z}_p \times \{0\}$ is a $\mathbb{Z}_p(G)$-submodule but
it is not a direct summand.

9.1.5 Let $G$ be a finite $p$-group. Show that if a $\mathbb{Z}_p(G)$-module $M$ is simple, then it is of
dimension 1 over $\mathbb{Z}_p$.

9.1.6 Show that the theorem of Schur is not true in $GL(n, \mathbb{C})$. Is it true in $GL(n, \mathbb{R})$?

9.1.7 Generalize the last result of the section for arbitrary fields.

9.1.8 Let $V$ be a vector space of dimension $n$ over a field $F$ with basis $\{e_1, e_2, \ldots, e_n\}$.
Define an $F(S_n)$-module structure on $V$ by $p \cdot \sum_{i=1}^n a_i e_i = \sum_{i=1}^n a_i e_{p(i)}$. Show that it
is not simple. Determine a simple submodule of $V$ and its direct complement.




9.2 Representations and Group Algebras 


Let $G$ be a group and $V$ a vector space over a field $F$. A homomorphism $\rho$ from $G$
to $GL(V)$ is called a linear representation of $G$ over $F$. We shall be interested in
the case when $V$ is finite dimensional. Such representations are called finite-dimensional
representations. The dimension of $V$ is called the degree of the representation. If we
fix a basis of $V$, then we get an isomorphism from $GL(V)$ to $GL(n, F)$, and so a
homomorphism from $G$ to $GL(n, F)$. This is also called a representation, or matrix
representation, of $G$ of degree $n$.

Let $\rho$ be a representation of a group $G$ on a vector space $V$ over a field $F$. Then $V$
becomes a left $F(G)$-module with respect to the external multiplication $\cdot$ defined by
$(\sum_{g \in G} a_g g) \cdot v = \sum_{g \in G} a_g \rho(g)(v)$. This module will be termed the module associated
to the representation $\rho$. Conversely, suppose that $V$ is a left $F(G)$-module. Then
$V$ is already a vector space over $F$, and for each $g \in G$, the map $\rho(g)$ from $V$
to $V$ defined by $\rho(g)(v) = g \cdot v$ is a linear transformation such that $\rho(g_1 g_2) =
\rho(g_1) \circ \rho(g_2)$ for all $g_1, g_2 \in G$, and $\rho(e) = I_V$. It follows that $\rho(g)$ is bijective and
$(\rho(g))^{-1} = \rho(g^{-1})$ for all $g \in G$. This says that $\rho$ is a representation of $G$ on $V$. This
representation will be termed the representation associated to the $F(G)$-module.
This correspondence between representations over $F$ and $F(G)$-modules is faithful
in the sense that each can be recovered from the other. We have already developed
the language of modules. The language of module theory and that of representations
correspond in the following manner.

1. $F(G)$-module $\longleftrightarrow$ representation over $F$.

2. $F(G)$-submodule $\longleftrightarrow$ sub-representation.

3. Simple $F(G)$-module $\longleftrightarrow$ irreducible representation.

4. Direct sum of $F(G)$-modules $\longleftrightarrow$ direct sum of representations.

5. If $V$ and $W$ are left $F(G)$-modules, then both of them are vector spaces over
$F$. We can make $V \otimes W$ an $F(G)$-module by defining $g \cdot (v \otimes w) = (g \cdot v) \otimes (g \cdot w)$.
The representation thus obtained is called the tensor product of the representations
corresponding to the $F(G)$-module $V$ and the $F(G)$-module $W$.

6. Let $\rho$ be a representation of $G$ associated to the $F(G)$-module $V$. Then the $r$th
exterior power $\bigwedge^r V$ is a vector space of dimension ${}^nC_r$, where $n$ is the dimension
of $V$. This can be made an $F(G)$-module by defining

$$g \cdot (v_1 \wedge v_2 \wedge \cdots \wedge v_r) = g v_1 \wedge g v_2 \wedge \cdots \wedge g v_r.$$

The representation thus obtained is called the $r$th exterior power of $\rho$, and it is denoted
by $\bigwedge^r \rho$.

7. Let $\rho$ be the representation associated to the $F(G)$-module $V$. Let $S^r(V)$ denote
the $r$th symmetric power of $V$. More precisely, $S^r(V) = (\otimes^r V)/A_r$, where $A_r$ is
the subspace of $\otimes^r V$ generated by the elements of the type

$$v_1 \otimes v_2 \otimes \cdots \otimes v_r - v_{p(1)} \otimes v_{p(2)} \otimes \cdots \otimes v_{p(r)},$$

where $p$ is a permutation in $S_r$. We already have an $F(G)$-module structure on $\otimes^r V$
defined above which affords the $r$th tensor power of the representation $\rho$ associated
to the $F(G)$-module $V$. It is easily noticed that $A_r$ is an $F(G)$-submodule of $\otimes^r V$.
Hence, $S^r V$ is also an $F(G)$-module. The associated representation is called the $r$th
symmetric power of $\rho$, and it is denoted by $S^r \rho$.

Consider the quotient map $\nu$ from $\otimes^2 V$ to $S^2 V$. The kernel of this map is $\bigwedge^2 V$
(verify). Thus, $\otimes^2 \rho = S^2 \rho \oplus \bigwedge^2 \rho$.

8. Representations associated to isomorphic $F(G)$-modules are called equivalent.
Suppose that $\rho$ is the representation of $G$ associated to the $F(G)$-module $V$, and $\eta$
a representation associated to the $F(G)$-module $W$. Then $\rho$ is equivalent to $\eta$ if and
only if there is an $F(G)$-module isomorphism $T$ from $V$ to $W$. Clearly, $T$ is a vector
space isomorphism from $V$ to $W$ such that $T(g \cdot v) = g \cdot T(v)$ for all $g \in G$ and
$v \in V$, or equivalently, $T(\rho(g)(v)) = \eta(g)(T(v))$ for all $g \in G$ and $v \in V$. This
means that $\eta(g) = T \rho(g) T^{-1}$ for all $g \in G$. Thus, $\rho$ is equivalent to $\eta$ if there is a
nonsingular linear transformation $T$ from $V$ to $W$ such that $\eta(g) = T \rho(g) T^{-1}$ for
all $g \in G$.
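The dimension bookkeeping behind items 6 and 7 can be checked mechanically. The sketch below uses the dimension ${}^nC_r$ of the exterior power stated in item 6 together with the standard formula ${}^{n+r-1}C_r$ for the dimension of the $r$th symmetric power, which is not stated in the text and is assumed here. The function names are illustrative.

```python
from math import comb

def exterior_dim(n, r):
    """Dimension of the r-th exterior power of an n-dimensional space."""
    return comb(n, r)

def symmetric_dim(n, r):
    """Dimension of the r-th symmetric power (standard formula, assumed)."""
    return comb(n + r - 1, r)

# For r = 2 the two dimensions add up to that of the tensor square, n^2,
# as the decomposition of the tensor square into S^2 and the exterior square requires.
tensor_square_splits = all(
    symmetric_dim(n, 2) + exterior_dim(n, 2) == n * n for n in range(1, 20)
)
```

For $r \geq 3$ the dimensions of $S^r V$ and $\bigwedge^r V$ no longer add up to $n^r$, so the two-term decomposition is special to $r = 2$.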

Given a representation $\rho$ from $G$ to $GL(V)$, a representation $\eta$ from $G$ to $GL(W)$
is called a subrepresentation if $W$ is a subspace of $V$ with $\rho(g)(W) \subseteq W$ for all
$g \in G$, and $\eta(g)$ is the restriction $\rho(g)|_W$ for all $g \in G$. A representation $\rho$ from $G$
to $GL(V)$ is irreducible if there is no nontrivial proper subspace $W$ of $V$ such that
$\rho(g)(W) \subseteq W$ for all $g \in G$. Maschke's theorem can be restated as follows:


Theorem 9.2.1 Let $G$ be a finite group and $F$ a field. Suppose that the characteristic of $F$ does
not divide $|G|$. Then every representation of $G$ over $F$ is a direct sum of irreducible
representations over $F$. $\sharp$


Thus, to determine all representations of a finite group $G$ over a field whose
characteristic does not divide $|G|$, it is sufficient to determine the nonequivalent irreducible
representations of $G$.


Example 9.2.2 Let $G$ be a group, and $V$ be a vector space over $F$. The trivial
homomorphism which takes every element of $G$ to the identity map $I_V$ on $V$ is a
representation called the trivial representation on $V$. It is irreducible if and only if $\dim V = 1$.


A one-dimensional representation of a group $G$ over a field $F$ is exactly a
homomorphism from $G$ to $F^*$. These are also irreducible representations of $G$. Distinct
one-dimensional representations over $F$ are all nonequivalent (why?).


Proposition 9.2.3 Every irreducible representation of an abelian group over an
algebraically closed field is one dimensional.


Proof Let $G$ be an abelian group, $F$ an algebraically closed field, and $\rho$ an irreducible
representation of $G$ on $V$. Consider $\rho(g)$. Since $F$ is an algebraically closed field, $\rho(g)$
has an eigenvalue $\lambda_g \in F$ (say). The corresponding eigenspace $V_{\lambda_g} \neq \{0\}$.
Let $h \in G$ and $v \in V_{\lambda_g}$. Then $\rho(g)(\rho(h)(v)) = \rho(gh)(v) = \rho(hg)(v) =
\rho(h)(\rho(g)(v)) = \lambda_g \rho(h)(v)$. Thus, $V_{\lambda_g}$ is invariant under $G$. Since $\rho$ is irreducible,
$V_{\lambda_g} = V$, and so $\rho(g)$ is multiplication by a scalar for each $g \in G$. This shows that
each subspace of $V$ is invariant under $G$. Since $\rho$ is an irreducible representation, there
should not be any nontrivial proper subspace of $V$, and so $V$ is one dimensional. $\sharp$


Thus, to find the irreducible representations of an abelian group $G$ over an algebraically
closed field $F$, it is sufficient to find all homomorphisms from $G$ to $F^*$.
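As an illustration of the preceding remark, the homomorphisms from a finite cyclic group to $\mathbb{C}^*$ can be written down explicitly. The sketch below assumes $G = \mathbb{Z}_n$ written additively and $F = \mathbb{C}$, and uses the standard characters $k \mapsto \omega^{jk}$ with $\omega = e^{2\pi i/n}$; the function name and the floating-point tolerance are illustrative choices.

```python
import cmath

def cyclic_characters(n):
    """Return the n homomorphisms Z_n -> C^*, as lists chi[k] = omega^(j*k)."""
    return [
        [cmath.exp(2j * cmath.pi * j * k / n) for k in range(n)]
        for j in range(n)
    ]

def is_homomorphism(chi, n, tol=1e-9):
    """Check chi(a + b) = chi(a) * chi(b) up to floating-point error."""
    return all(
        abs(chi[(a + b) % n] - chi[a] * chi[b]) < tol
        for a in range(n)
        for b in range(n)
    )

n = 4
chars = cyclic_characters(n)
all_multiplicative = all(is_homomorphism(chi, n) for chi in chars)
```

Each of the $n$ lists is a one-dimensional representation of $\mathbb{Z}_n$, and `chars[0]` is the trivial one.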

By Maschke's theorem, $F(G)$ is semi-simple if the characteristic of $F$ does
not divide $|G|$. By Theorem 9.1.19, it follows that every simple $F(G)$-module is
isomorphic to a left ideal of $F(G)$ which is also a direct summand of $F(G)$ considered
as a left module. It is, therefore, necessary to find the structure of the group algebra
$F(G)$. First, we find the division rings $\mathrm{End}_{F(G)}(V)$, where $V$ is a simple $F(G)$-module.
In case $F$ is an algebraically closed field, we have the following proposition.


Proposition 9.2.4 Let $F$ be an algebraically closed field, $G$ a finite group, and $V$ a
simple $F(G)$-module. Then for any $T \in \mathrm{End}_{F(G)}(V)$, there exists a unique $\lambda_T \in F$
such that $T$ is multiplication by $\lambda_T$. Further, the map $\lambda$ from $\mathrm{End}_{F(G)}(V)$ to $F$ defined
by $\lambda(T) = \lambda_T$ is an isomorphism.


Proof Let $T \in \mathrm{End}_{F(G)}(V)$. Then $T$ is a linear transformation on $V$. Since $F$ is
algebraically closed, $T$ has an eigenvalue $\lambda_T$ (say). Consider the eigenspace $V_{\lambda_T} \neq
\{0\}$. Given any $v \in V_{\lambda_T}$ and $h \in G$, $T(h \cdot v) = h \cdot T(v) = h \cdot \lambda_T v = \lambda_T h \cdot v$.
This shows that $V_{\lambda_T}$ is an $F(G)$-submodule of $V$. Since $V$ is simple, $V = V_{\lambda_T}$,
and so $T$ is multiplication by $\lambda_T$. The map which takes $T$ to $\lambda_T$ is clearly an
isomorphism. $\sharp$


Theorem 9.2.5 Let $F$ be an algebraically closed field, and $G$ be a finite group such
that the characteristic of $F$ does not divide $|G|$. Then there are only finitely many
nonequivalent irreducible representations, of degrees $n_1, n_2, \ldots, n_r$, such that the
following holds.

(i) The group algebra $F(G)$ is isomorphic as an $F$-algebra to
$M_{n_1}(F) \times M_{n_2}(F) \times \cdots \times M_{n_r}(F)$.

(ii) $n_1^2 + n_2^2 + \cdots + n_r^2 = |G|$.

(iii) $n_1 = 1$ corresponds to the degree of the trivial representation.

(iv) The number $r$ of nonequivalent irreducible representations is the number of
conjugacy classes of $G$ (called the class number of $G$).

Proof From Theorem 9.1.19 and the above proposition, it follows that there are
positive integers $n_1, n_2, \ldots, n_r$ such that $F(G)$ as an $F$-algebra is isomorphic to

$$M_{n_1}(F) \times M_{n_2}(F) \times \cdots \times M_{n_r}(F).$$

Comparing the $F$-dimension of $F(G)$ and that of

$$M_{n_1}(F) \times M_{n_2}(F) \times \cdots \times M_{n_r}(F),$$

we obtain (ii). Clearly, the simple left ideals of the above algebra are isomorphic
to the simple left ideals of $M_{n_i}(F)$ for $i = 1, 2, \ldots, r$. All simple left ideals of
$M_{n_i}(F)$ are isomorphic, and are of dimension $n_i$. Thus, $n_1, n_2, \ldots, n_r$ represent the
dimensions over $F$ of the simple left $F(G)$-modules, and so they represent the degrees
of the irreducible representations of $G$ over $F$. Since the trivial representation is of
degree 1, we may assume that $n_1 = 1$.

Finally, we prove (iv) by comparing the dimension of the center of $F(G)$ and that
of the algebra

$$M_{n_1}(F) \times M_{n_2}(F) \times \cdots \times M_{n_r}(F).$$

The center of $M_{n_i}(F)$ is the ring of all scalar matrices, which is a vector space of
dimension 1 over $F$. Hence, the dimension of the center of

$$M_{n_1}(F) \times M_{n_2}(F) \times \cdots \times M_{n_r}(F)$$

is $r$. Let $\{C_1, C_2, \ldots, C_t\}$ be the set of all distinct conjugacy classes of $G$. Let
$u_i = \sum_{x \in C_i} x$. Since $g u_i g^{-1} = \sum_{x \in C_i} g x g^{-1} = \sum_{y \in C_i} y = u_i$, it follows that
$u_i$ is in the center of $F(G)$ for each $i$. We show that $\{u_1, u_2, \ldots, u_t\}$ is a basis of
the center of $F(G)$. Since distinct conjugacy classes are disjoint, and the set $G$ is
linearly independent in $F(G)$, it follows that $\{u_1, u_2, \ldots, u_t\}$ is linearly independent.
Let $\sum_{g \in G} a_g g$ be a member of the center of $F(G)$. Then $h(\sum_{g \in G} a_g g)h^{-1} = \sum_{g \in G} a_g g$
for each $h \in G$. Since $G$ is linearly independent, comparing the coefficients, we
get that $a_g = a_{hgh^{-1}}$ for all $h \in G$. Thus, $a_g = a_h$ whenever $g$ and $h$ are
in the same conjugacy class. Let $\alpha_i = a_g$ for each $g \in C_i$. Then $\sum_{g \in G} a_g g =
\sum_{i=1}^t \alpha_i u_i$. This shows that $\{u_1, u_2, \ldots, u_t\}$ forms a basis of the center of $F(G)$. Thus,
$t$ is the dimension of the center of $F(G)$. Hence, $r = t$ is the number of conjugacy
classes of $G$. $\sharp$
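Parts (ii) and (iv) of Theorem 9.2.5 can be checked by direct enumeration for a small group. The sketch below assumes $G = S_3$, with permutations of $\{0, 1, 2\}$ stored as tuples, and takes the degrees $1, 1, 2$ as given (they are derived in Example 9.2.9). The function names are illustrative.

```python
from itertools import permutations

def compose(p, q):
    """(p o q)(i) = p[q[i]] for permutations stored as tuples."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugacy_classes(group):
    """Partition the group into conjugacy classes {g x g^-1 : g in G}."""
    classes, seen = [], set()
    for x in group:
        if x in seen:
            continue
        cls = frozenset(compose(compose(g, x), inverse(g)) for g in group)
        classes.append(cls)
        seen |= cls
    return classes

S3 = list(permutations(range(3)))
classes = conjugacy_classes(S3)

# Degrees of the irreducible representations of S_3 (taken as given here).
degrees = [1, 1, 2]
sum_of_squares = sum(d * d for d in degrees)
```

The number of classes found equals the number of degrees, and `sum_of_squares` equals the order of the group.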


Remark 9.2.6 We shall show later that the degrees of irreducible representations 
divide the order of the group. 


Example 9.2.7 Let $F$ be an algebraically closed field of characteristic different from
2. We find the irreducible representations of Klein's four group $V_4$. From the above
results, it follows that there are 4 irreducible representations of $V_4$, and they are all of
degree 1. Thus, we have to find all 4 distinct group homomorphisms from $V_4$ to $F^*$.
We list them. Let $\rho_1$ denote the trivial homomorphism which maps each element of
$V_4$ to 1. Let $\rho_2$ denote the map which takes $e$ and $a$ to $1$ and $b$ and $c$ to $-1$. Check that
it is indeed a homomorphism. Let $\rho_3$ denote the map which takes $e$ and $b$ to $1$ and the
rest to $-1$. $\rho_4$ is the map which takes $e$ and $c$ to $1$ and the rest to $-1$. These are the
only irreducible representations of $V_4$. Note that all these irreducible representations
of $V_4$ are realized over any field of characteristic different from 2.
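The check requested in the example can be carried out mechanically. The sketch below assumes $V_4$ is encoded as $\mathbb{Z}_2 \times \mathbb{Z}_2$ with $a = (1, 0)$, $b = (0, 1)$, $c = (1, 1)$; this encoding and the helper names are illustrative choices.

```python
# Encode V_4 = {e, a, b, c} as Z_2 x Z_2 (an assumed but standard identification).
coords = {"e": (0, 0), "a": (1, 0), "b": (0, 1), "c": (1, 1)}
names = {v: k for k, v in coords.items()}
elements = list(coords)

def mul(x, y):
    """Product in V_4, computed coordinatewise modulo 2."""
    (x1, x2), (y1, y2) = coords[x], coords[y]
    return names[((x1 + y1) % 2, (x2 + y2) % 2)]

# The four maps rho_1, ..., rho_4 of the example, with values in {1, -1}.
rho = {
    "rho1": {"e": 1, "a": 1, "b": 1, "c": 1},
    "rho2": {"e": 1, "a": 1, "b": -1, "c": -1},
    "rho3": {"e": 1, "a": -1, "b": 1, "c": -1},
    "rho4": {"e": 1, "a": -1, "b": -1, "c": 1},
}

def is_homomorphism(chi):
    """Check chi(xy) = chi(x) chi(y) for all x, y in V_4."""
    return all(chi[mul(x, y)] == chi[x] * chi[y] for x in elements for y in elements)

all_homomorphisms = all(is_homomorphism(chi) for chi in rho.values())
```

All four maps pass the check, and they are pairwise distinct, so they account for all $4$ irreducible representations.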


Example 9.2.8 We find all irreducible representations of the quaternion group $Q_8$
over an algebraically closed field $F$ of characteristic different from 2. There will be
as many irreducible representations of $Q_8$ as there are conjugacy classes of $Q_8$. There
are 5 conjugacy classes of $Q_8$. They are $\{1\}$, $\{-1\}$, $\{i, -i\}$, $\{j, -j\}$ and $\{k, -k\}$.
Thus, there are 5 irreducible representations, of degrees $n_1 = 1, n_2, n_3, n_4, n_5$, such that
$1 + n_2^2 + n_3^2 + n_4^2 + n_5^2 = 8$. The only possible solution is $n_2 = n_3 = n_4 = 1$, and
$n_5 = 2$. In other words, there are 4 irreducible representations of degree 1, including
the trivial representation, and there is a unique two-dimensional irreducible
representation. We list them. All one-dimensional representations are just homomorphisms
from $Q_8$ to $F^*$. Note that the kernel of any homomorphism from $Q_8$ to $F^*$ contains
the commutator subgroup $\{1, -1\}$ of $Q_8$ (for $F^*$ is abelian). Since $Q_8/\{1, -1\}$ is
isomorphic to Klein's four group, we get the four homomorphisms from $Q_8$ to $F^*$
as in the above example. Thus, we have 4 one-dimensional representations, viz., $\rho_1$
the trivial representation, and $\rho_2$ the homomorphism which takes $1$ and $-1$ to $1$, $i$ and $-i$ to
$1$, and the rest of them to $-1$. Similarly, we have two other homomorphisms from $Q_8$
to $F^*$. Finally, we determine the two-dimensional irreducible representation. Since
$F$ is an algebraically closed field of characteristic different from 2, $X^4 - 1 = 0$ has
4 distinct roots, which form a cyclic group of order 4. Let $\xi$ denote a primitive 4th
root of unity. Then the map


defines a representation which is irreducible. 
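One standard way to realize a two-dimensional representation of $Q_8$ is sketched below. The matrices `rho_i`, `rho_j`, `rho_k` are an assumption (a common textbook choice, which may differ from the map intended above), and $F = \mathbb{C}$ with $\xi = \sqrt{-1}$ is assumed for the computation. The code checks only the defining relations of $Q_8$ and the order of the generated group; it does not verify irreducibility.

```python
def matmul(x, y):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(x[r][m] * y[m][c] for m in range(2)) for c in range(2))
        for r in range(2)
    )

def neg(x):
    """Entrywise negative of a matrix."""
    return tuple(tuple(-v for v in row) for row in x)

xi = 1j  # a primitive 4th root of unity in C (assumed field)

identity = ((1, 0), (0, 1))
rho_i = ((xi, 0), (0, -xi))    # assumed image of i
rho_j = ((0, 1), (-1, 0))      # assumed image of j
rho_k = matmul(rho_i, rho_j)   # image of k = ij, equal to [[0, xi], [xi, 0]]

def generate(gens):
    """Close the generators under multiplication (a finite group is assumed)."""
    elems, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = matmul(g, s)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return elems

image = generate([rho_i, rho_j])
```

The images satisfy $i^2 = j^2 = k^2 = -1$ and $ji = -ij$, and the generated group has 8 elements, so the assignment is an injective homomorphism on $Q_8$.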


Example 9.2.9 Let F be an algebraically closed field of characteristic different from
2 and 3. We find all the irreducible representations of the symmetric group $S_3$ over
F. Since there are 3 conjugacy classes of $S_3$, there are 3 irreducible representations of
$S_3$ over F. Suppose that their degrees are $1, n_2, n_3$. Then $1 + n_2^2 + n_3^2 = 6$. The only
possible solution is $n_2 = 1$ and $n_3 = 2$. Thus, there are 2 one-dimensional irreducible
representations $\rho_1, \rho_2$, and 1 two-dimensional irreducible representation $\rho_3$. The
one-dimensional representations $\rho_1$ and $\rho_2$ are just homomorphisms from $S_3$ to $F^*$. We
have the trivial homomorphism $\rho_1$ from $S_3$ to $F^*$ which maps every member of $S_3$ to 1,
and a nontrivial homomorphism $\rho_2$ from $S_3$ to $F^*$ given by $\rho_2(p) = \chi(p)$, where $\chi$
is the alternating map. Now, we describe the two-dimensional irreducible representation
$\rho_3$. Let V be a vector space over F of dimension 3 with a basis $\{e_1, e_2, e_3\}$. Consider
the representation $\rho$ from $S_3$ to GL(V) defined by $\rho(p)(x_1e_1 + x_2e_2 + x_3e_3) =
x_1e_{p(1)} + x_2e_{p(2)} + x_3e_{p(3)}$. Consider the subspace $U = \{a(e_1 + e_2 + e_3) \mid a \in F\}$
of V. Clearly, U is such that $\rho(g)(U) \subseteq U$. In the language of modules, U is an $F(S_3)$-
submodule of V. The sub-representation thus obtained is the trivial representation
$\rho_1$. Consider the subspace $W = \{x_1e_1 + x_2e_2 + x_3e_3 \mid x_1 + x_2 + x_3 = 0\}$ of V.
Clearly, W is of dimension 2, and it is also an $F(S_3)$-submodule. The corresponding
representation $\rho_3$ is 2 dimensional. We show that it is irreducible by showing that
W is a simple $F(S_3)$-module. Let $w = x_1e_1 + x_2e_2 + x_3e_3$ be a nonzero element
of W. Then at least two of $x_1, x_2, x_3$ are nonzero, and $x_1 + x_2 + x_3 = 0$. Suppose
that $x_1 \neq 0 \neq x_2$. We show that there is a permutation $p \in S_3$ such that w and
$\rho_3(p)(w) = p \cdot w$ are linearly independent. Suppose not. Then w and $p \cdot w$ are
linearly dependent for all $p \in S_3$. Thus, for each $p \in S_3$, there is a scalar $a_p$ such
that $w = a_p\, p \cdot w$. Taking $p = (2, 3)$ and comparing the coefficients of $e_1, e_2, e_3$,
we find that $a_p = 1$, and $x_2 = x_3$. Similarly, $x_1 = x_2$. Since $x_1 + x_2 + x_3 = 0$ and the
characteristic of F is not 3, we see that $x_i = 0$ for all i. This is a contradiction.
Hence, there is a $p \in S_3$ such that w and $p \cdot w$ are linearly independent. Since W is
two-dimensional, any $F(S_3)$-submodule containing w is all of W. This shows that W has
no nontrivial proper $F(S_3)$-submodule, and so $\rho_3$ is the two-dimensional irreducible
representation. The exterior power $\bigwedge^2 \rho_3$ is a one-dimensional representation which
maps even permutations to 1, and the odd permutations to $-1$ (this representation is
called the sign representation).
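
The characters of the three representations found in Example 9.2.9 can be checked numerically. The following Python sketch is an illustration only and is not part of the original text. Permutations of {0, 1, 2} stand in for $S_3$, all helper names are hypothetical, and the trace of $\rho_3$ is computed as the number of fixed points minus 1 (the trace on V minus the trace on U). The exterior square is tested through the standard identity $\chi_{\wedge^2\rho}(g) = (\chi_\rho(g)^2 - \chi_\rho(g^2))/2$, which is assumed here rather than proved in the text.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def sign(p):
    """Sign of a permutation, computed by counting inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def fixed_points(p):
    """Trace of the permutation matrix of p acting on V."""
    return sum(1 for i, image in enumerate(p) if i == image)

S3 = list(permutations(range(3)))
identity = (0, 1, 2)

chi_1 = {p: 1 for p in S3}                    # trivial representation rho_1
chi_2 = {p: sign(p) for p in S3}              # alternating (sign) representation rho_2
chi_3 = {p: fixed_points(p) - 1 for p in S3}  # rho_3 on W: trace on V minus trace on U

degrees = [chi[identity] for chi in (chi_1, chi_2, chi_3)]
sum_of_squares = sum(d * d for d in degrees)

# Character of the exterior square of rho_3 via (chi(g)^2 - chi(g^2)) / 2.
chi_wedge = {p: (chi_3[p] ** 2 - chi_3[compose(p, p)]) // 2 for p in S3}

print(degrees, sum_of_squares)
print(chi_wedge == chi_2)
```

The degrees satisfy $1 + n_2^2 + n_3^2 = 6$, and the computed character of $\bigwedge^2 \rho_3$ agrees with the sign character on every element.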


Exercises 


9.2.1 Find all irreducible representations of a group of order 15 over the field C of 
complex numbers as well as over the field Q of rational numbers. 


9.2.2 Find all irreducible representations of the dihedral group $D_8$ and also of $A_4$
over C and also over Q.


9.2.3 Find the number of irreducible representations of $S_4$ over C. Find also the
degrees. Determine the structure of $C(S_4)$. Find some of the irreducible
representations of $S_4$ using the method of the last example of this section.


9.2.4 Show that over any field the number of irreducible representations of a group 
G can be at the most the class number of G. 


9.2.5 Find the number of nonequivalent complex irreducible representations of each
of the extra special p-groups of order $p^3$. Find also their degrees.


9.3. Characters, Orthogonality Relations 


Let $\rho$ be a representation of a group G on a finite-dimensional vector space V over
a field F. The map $\chi_\rho$ from G to F defined by $\chi_\rho(g) = \operatorname{trace}\rho(g)$ is called the
character of G afforded by the representation $\rho$. Characters afforded by irreducible
representations are called irreducible characters.


Proposition 9.3.1 Characters are class functions in the sense that they are constants 
on conjugacy classes of G. 


Proof Let $\chi_\rho$ be the character afforded by the representation $\rho$. Then
$\rho(ghg^{-1}) = \rho(g)\rho(h)\rho(g)^{-1}$ for all $g, h \in G$. Since similar transformations have
the same trace, it follows that $\chi_\rho(h) = \chi_\rho(ghg^{-1})$ for all $g, h \in G$. ♯


Proposition 9.3.2 Equivalent representations afford same characters. 


Proof Let $\rho$ and $\eta$ be equivalent representations on vector spaces V and W,
respectively. Then there is an isomorphism T from V to W such that $\eta(g) = T\rho(g)T^{-1}$
for all $g \in G$. Hence, $\chi_\eta(g) = \operatorname{trace}(\eta(g)) = \operatorname{trace}(\rho(g)) = \chi_\rho(g)$ for all
$g \in G$. ♯


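Proposition 9.3.3 Let $\rho_1$ and $\rho_2$ be representations. Then $\chi_{\rho_1 \oplus \rho_2} = \chi_{\rho_1} + \chi_{\rho_2}$ and
$\chi_{\rho_1 \otimes \rho_2} = \chi_{\rho_1} \cdot \chi_{\rho_2}$.


Proof The result follows from the fact that $\operatorname{trace}(T_1 \oplus T_2) = \operatorname{trace}T_1 + \operatorname{trace}T_2$
and $\operatorname{trace}(T_1 \otimes T_2) = \operatorname{trace}(T_1) \cdot \operatorname{trace}(T_2)$. ♯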
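
The two trace identities used in this proof are easy to test on explicit matrices. The sketch below is an illustration with hypothetical helper names and arbitrarily chosen integer matrices. It models $T_1 \oplus T_2$ as a block diagonal matrix and $T_1 \otimes T_2$ as a Kronecker product, which is an assumption about the choice of bases.

```python
def trace(m):
    """Sum of the diagonal entries of a square matrix given as a list of rows."""
    return sum(m[i][i] for i in range(len(m)))

def direct_sum(a, b):
    """Block diagonal matrix diag(a, b), the matrix of a ⊕ b."""
    n, m = len(a), len(b)
    top = [row + [0] * m for row in a]
    bottom = [[0] * n + row for row in b]
    return top + bottom

def kronecker(a, b):
    """Kronecker product of square matrices, the matrix of a ⊗ b in the product basis."""
    m = len(b)
    size = len(a) * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]

# Arbitrary sample matrices (not taken from the text).
T1 = [[1, 2], [3, 4]]
T2 = [[0, 1, 0], [0, 0, 1], [1, 0, 5]]

print(trace(direct_sum(T1, T2)) == trace(T1) + trace(T2))
print(trace(kronecker(T1, T2)) == trace(T1) * trace(T2))
```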


From the above result, it follows that sums and products of characters are
characters, and the set of characters forms a semi-ring. We complete this semi-ring to a
ring by adjoining negatives of characters; the elements of this ring are called virtual
characters. The ring thus obtained is called the character ring, and it is denoted by Ch(G).

Let $\rho$ be the representation afforded by the F(G)-module M, $\mu$ the representation
afforded by the F(G)-submodule N of M, and $\nu$ the representation afforded by the
quotient module M/N. It follows from elementary linear algebra that $\operatorname{trace}\rho(g) =
\operatorname{trace}\mu(g) + \operatorname{trace}\nu(g)$ for all $g \in G$. Next, since M is a finite-dimensional vector
space, there is a composition series of the F(G)-module M whose factors are simple.
This proves the following proposition.


Proposition 9.3.4 Every character (even if the characteristic of F divides the order of
the group) is a sum of irreducible characters. ♯


The members of F(G) can be viewed as functions from G to F. Indeed, we identify
the member $\sum_{g \in G} a_g g$ with the function $\alpha$ from G to F defined by $\alpha(g) = a_g$. A
character $\chi$ of G is, therefore, a member of F(G). Since characters are class functions,
they belong to the center of F(G).

Let $\rho$ be a representation of G on a finite-dimensional vector space V over a field
F. Let $\{x_1, x_2, \ldots, x_n\}$ be a basis of V. Then we get $n^2$ functions $\rho_{ij}$ from G to F
defined by


$$\rho(g)(x_j) = \sum_{i=1}^{n} \rho_{ij}(g)x_i.$$


The character $\chi_\rho$ of $\rho$ is given by $\chi_\rho(g) = \sum_{i=1}^{n} \rho_{ii}(g)$.
Suppose that the characteristic of F does not divide | G |. Define a map $\langle\ ,\ \rangle$
from F(G) × F(G) to F by


$$\langle \alpha, \beta \rangle = (|G| \cdot 1)^{-1} \sum_{g \in G} \alpha(g)\beta(g^{-1}),$$


where 1 denotes the identity of the field F. It is easy to observe that $\langle\ ,\ \rangle$ is a
symmetric bilinear form on F(G). Suppose that $\langle \alpha, \beta \rangle = 0$ for all $\beta$ in F(G).
Then for each $h \in G$, we have $(|G| \cdot 1)^{-1}\alpha(h) = \langle \alpha, i_{h^{-1}} \rangle = 0$, where $i_{h^{-1}}$ is the map
from G to F which takes $h^{-1}$ to 1, and the rest of the elements to 0. This shows
that $\alpha = 0$, and so $\langle\ ,\ \rangle$ is a nondegenerate symmetric bilinear form on F(G).
Such a bilinear form is also called an inner product on F(G). Now, we shall show
that the set of irreducible characters of G over F forms an orthonormal basis of the
center of F(G). The results which follow are due to Frobenius, and are called the
orthogonality relations.


Theorem 9.3.5 Let G be a finite group, and F be an algebraically closed field whose
characteristic does not divide the order of G. Let $\rho$ and $\eta$ be nonequivalent irreducible
representations. Then


$$(|G| \cdot 1)^{-1} \sum_{g \in G} \rho_{ik}(g^{-1})\eta_{pj}(g) = 0$$


for all i, j, k and p.


Proof Let V and W be F(G)-modules corresponding to the representations $\rho$ and $\eta$,
respectively. Let $\{x_1, x_2, \ldots, x_n\}$ be a basis of V, and $\{y_1, y_2, \ldots, y_m\}$ be a basis of
W. Let $T_{ji}$ be the linear transformation from V to W which takes $x_i$ to $y_j$ and $x_k$ to
0 for $k \neq i$. $T_{ji}$ need not be an F(G)-module homomorphism. We average it to make
an F(G)-module homomorphism. Define $\hat{T}_{ji}$ by


$$\hat{T}_{ji}(v) = (|G| \cdot 1)^{-1} \sum_{g \in G} \eta(g)T_{ji}(\rho(g^{-1})(v)).$$


Then $\hat{T}_{ji}$ is an F(G)-module homomorphism from V to W (see the proof of
Maschke's theorem). Since V and W are simple and nonisomorphic F(G)-modules,
any F(G)-homomorphism from V to W is the 0 map. Since $\hat{T}_{ji}$ is an F(G)-
homomorphism from V to W, it follows that $\hat{T}_{ji} = 0$. Hence,


$$0 = \hat{T}_{ji}(x_k) = (|G| \cdot 1)^{-1} \sum_{g \in G} \eta(g)T_{ji}\Big(\sum_{l=1}^{n} \rho_{lk}(g^{-1})x_l\Big).$$


This in turn gives


$$(|G| \cdot 1)^{-1} \sum_{g \in G} \rho_{ik}(g^{-1}) \sum_{p=1}^{m} \eta_{pj}(g)y_p = 0.$$


Since $\{y_1, y_2, \ldots, y_m\}$ is a basis, we see that


$$(|G| \cdot 1)^{-1} \sum_{g \in G} \rho_{ik}(g^{-1})\eta_{pj}(g) = 0$$
for all i, j, k and p. ♯


Corollary 9.3.6 Let $\chi_\rho$ and $\chi_\eta$ be distinct irreducible characters of G over an
algebraically closed field whose characteristic does not divide | G |. Then
$\langle \chi_\rho, \chi_\eta \rangle = 0$.


Proof Since $\chi_\rho \neq \chi_\eta$, $\rho$ and $\eta$ are nonequivalent. In the above theorem, putting
k = i and p = j, and then summing over all i and j, we see that
$\langle \chi_\rho, \chi_\eta \rangle = 0$. ♯


Theorem 9.3.7 Let $\rho$ be an irreducible representation of a finite group G on a vector
space V of dimension n over an algebraically closed field whose characteristic does
not divide | G |. Let $\{x_1, x_2, \ldots, x_n\}$ be a basis of V, and $[\rho_{ij}(g)]$ be the matrix of
$\rho(g)$ with respect to this basis. Then


(i) $\sum_{g \in G} \rho_{ij}(g^{-1})\rho_{kl}(g) = 0$ if $j \neq k$ or $i \neq l$, and
(ii) $n \cdot \sum_{g \in G} \rho_{ij}(g^{-1})\rho_{ji}(g) = |G| \cdot 1$.


Proof V is a simple F(G)-module. By Proposition 9.2.4, every F(G)-endomorphism
of V is multiplication by a scalar. Fix i and j, and then consider the average $\hat{T}_{ji}$ of the
linear transformation $T_{ji}$ which maps $x_i$ to $x_j$ and $x_k$ to 0 for $k \neq i$. Suppose that this
endomorphism $\hat{T}_{ji}$ of V is multiplication by $a_{ji}$. Then $\hat{T}_{ji}(x_k) = a_{ji}x_k$ for all i, j and k. Now

$$\hat{T}_{ji}(x_k) = (|G| \cdot 1)^{-1} \sum_{g \in G} \rho(g)(T_{ji}(\rho(g^{-1})(x_k))).$$

In turn, we get that

$$a_{ji}x_k = (|G| \cdot 1)^{-1} \sum_{g \in G} \sum_{l=1}^{n} \rho_{lj}(g)\rho_{ik}(g^{-1})x_l.$$

Since $\{x_1, x_2, \ldots, x_n\}$ is a basis, we get that

$$(|G| \cdot 1)^{-1} \sum_{g \in G} \rho_{lj}(g)\rho_{ik}(g^{-1}) = 0$$

if $k \neq l$, and

$$a_{ji} = (|G| \cdot 1)^{-1} \sum_{g \in G} \rho_{kj}(g)\rho_{ik}(g^{-1}).$$

Applying the above argument again, we see that $a_{ji} = 0$ if $i \neq j$. Also

$$a_{ii} = (|G| \cdot 1)^{-1} \sum_{g \in G} \rho_{ki}(g)\rho_{ik}(g^{-1}) = a_{kk}$$

for all i, k. Now,

$$n a_{ii} = (|G| \cdot 1)^{-1} \sum_{k=1}^{n} \sum_{g \in G} \rho_{ki}(g)\rho_{ik}(g^{-1})
= (|G| \cdot 1)^{-1} \sum_{g \in G} \Big(\sum_{k=1}^{n} \rho_{ki}(g)\rho_{ik}(g^{-1})\Big).$$

Next, $1 = \rho_{ii}(g^{-1}g) = \sum_{k=1}^{n} \rho_{ik}(g^{-1})\rho_{ki}(g)$. Hence

$$n a_{ii} = (|G| \cdot 1)^{-1}(|G| \cdot 1) = 1.$$

This shows that

$$n \cdot (|G| \cdot 1)^{-1} \sum_{g \in G} \rho_{ki}(g)\rho_{ik}(g^{-1}) = 1.$$

Multiplying by $|G| \cdot 1$, we get the result. ♯


Corollary 9.3.8 (Orthogonality relation) Let G be a finite group and F an
algebraically closed field whose characteristic does not divide the order of G. Then the
set of irreducible characters of G over F forms an orthonormal basis of the center of
F(G) (which can be interpreted as the vector space of class functions on G).


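Proof By Corollary 9.3.6, it follows that distinct irreducible characters are
orthogonal. Next, if $\chi_\rho$ is an irreducible character afforded by the irreducible
representation $\rho$ of degree n, then from the above theorem it follows that


$$\langle \chi_\rho, \chi_\rho \rangle = (|G| \cdot 1)^{-1} \sum_{g \in G} \chi_\rho(g)\chi_\rho(g^{-1})$$
$$= (|G| \cdot 1)^{-1} \sum_{g \in G} \Big(\sum_{i=1}^{n} \rho_{ii}(g)\Big)\Big(\sum_{k=1}^{n} \rho_{kk}(g^{-1})\Big)$$
$$= (|G| \cdot 1)^{-1} \sum_{i=1}^{n} \sum_{g \in G} \rho_{ii}(g)\rho_{ii}(g^{-1}) = 1.$$


Here the terms with $i \neq k$ vanish by part (i) of the theorem, and each of the n remaining
inner sums contributes $(|G| \cdot 1)/n$ by part (ii).
This shows that the set of irreducible characters forms an orthonormal set. Since the
number of irreducible characters is the class number of G, and the class number is the
dimension of the center (the space of class functions on G) of F(G), it follows that
the set of irreducible characters forms an orthonormal basis of the center of F(G). ♯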
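
As a concrete instance of the orthogonality relation, the Gram matrix of the three irreducible characters of $S_3$ can be computed with respect to the form $\langle \alpha, \beta \rangle = |G|^{-1}\sum_{g} \alpha(g)\beta(g^{-1})$. The sketch below is an illustration only. It works over the rationals, represents $S_3$ by permutations of {0, 1, 2}, and uses hypothetical helper names.

```python
from fractions import Fraction
from itertools import permutations

def inverse(p):
    """Inverse of a permutation given as a tuple of images."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def sign(p):
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def fixed_points(p):
    return sum(1 for i, image in enumerate(p) if i == image)

def inner(alpha, beta, group):
    """<alpha, beta> = (1/|G|) * sum over g of alpha(g) * beta(g^{-1})."""
    total = sum(alpha(g) * beta(inverse(g)) for g in group)
    return Fraction(total, len(group))

S3 = list(permutations(range(3)))

# Trivial, sign, and two-dimensional characters of S3 (see Example 9.2.9).
irreducibles = [lambda p: 1, sign, lambda p: fixed_points(p) - 1]

gram = [[inner(a, b, S3) for b in irreducibles] for a in irreducibles]
for row in gram:
    print([str(entry) for entry in row])
```

The Gram matrix is the 3 × 3 identity, and 3 is the class number of $S_3$, in agreement with Corollary 9.3.8.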


Corollary 9.3.9 Let G be a finite group, and F be an algebraically closed field of
characteristic 0. Then a representation $\rho$ over F is equivalent to a representation $\eta$
over F if and only if $\chi_\rho = \chi_\eta$.


Proof Clearly, equivalent representations have the same characters. Conversely,
suppose that $\rho$ and $\eta$ are representations such that $\chi_\rho = \chi_\eta$. Let $\{\rho_1, \rho_2, \ldots, \rho_r\}$
be the set of pairwise nonequivalent irreducible representations such that each
irreducible representation is equivalent to one of them (r is the class number
of G). Then there exist nonnegative integers $n_1, n_2, \ldots, n_r$ and $m_1, m_2, \ldots, m_r$
such that $\rho$ is equivalent to $n_1\rho_1 \oplus n_2\rho_2 \oplus \cdots \oplus n_r\rho_r$, and $\eta$ is equivalent to
$m_1\rho_1 \oplus m_2\rho_2 \oplus \cdots \oplus m_r\rho_r$. Then $\chi_\rho = n_1\chi_{\rho_1} + n_2\chi_{\rho_2} + \cdots + n_r\chi_{\rho_r}$, and
$\chi_\eta = m_1\chi_{\rho_1} + m_2\chi_{\rho_2} + \cdots + m_r\chi_{\rho_r}$. Since $\chi_\rho = \chi_\eta$, by the orthogonality
relation, we have $n_i \cdot 1 = \langle \chi_\rho, \chi_{\rho_i} \rangle = \langle \chi_\eta, \chi_{\rho_i} \rangle = m_i \cdot 1$. Since F is of
characteristic 0, we get that $n_i = m_i$ for all i. This shows that $\rho$ is equivalent to $\eta$. ♯


Corollary 9.3.10 Let $\rho$ be a representation of G over an algebraically closed field
of characteristic 0. Then $\rho$ is irreducible if and only if $\langle \chi_\rho, \chi_\rho \rangle = 1$.


Proof Let $\{\rho_1, \rho_2, \ldots, \rho_r\}$ be a complete set of pairwise nonequivalent irreducible
representations. Then $\rho = m_1\rho_1 \oplus m_2\rho_2 \oplus \cdots \oplus m_r\rho_r$, where each $m_i$ is a nonnegative
integer. Then $\chi_\rho = m_1\chi_{\rho_1} + m_2\chi_{\rho_2} + \cdots + m_r\chi_{\rho_r}$. From the orthogonality relation,
we find that $\langle \chi_\rho, \chi_\rho \rangle = m_1^2 + m_2^2 + \cdots + m_r^2$. Thus, $\langle \chi_\rho, \chi_\rho \rangle = 1$ if and
only if $m_i = 1$ for a unique i, and the rest of the $m_j$ are 0. Thus, $\langle \chi_\rho, \chi_\rho \rangle = 1$ if
and only if $\rho$ is equivalent to $\rho_i$ for some i. ♯
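
The last two corollaries give a practical recipe: the multiplicity of $\rho_i$ in $\rho$ is $\langle \chi_\rho, \chi_{\rho_i} \rangle$, and $\langle \chi_\rho, \chi_\rho \rangle$ is the sum of the squares of these multiplicities. The following sketch applies this to two characters of $S_3$, the regular character and the permutation character on three points. It is an illustration with hypothetical names, working over the rationals.

```python
from fractions import Fraction
from itertools import permutations

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def sign(p):
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def fixed_points(p):
    """Permutation character of S3 acting on three points."""
    return sum(1 for i, image in enumerate(p) if i == image)

def inner(alpha, beta, group):
    return Fraction(sum(alpha(g) * beta(inverse(g)) for g in group), len(group))

S3 = list(permutations(range(3)))
identity = (0, 1, 2)
irreducibles = [lambda p: 1, sign, lambda p: fixed_points(p) - 1]

def regular(p):
    """Character of the regular representation: |G| at the identity, 0 elsewhere."""
    return len(S3) if p == identity else 0

def multiplicities(chi):
    """Multiplicity of each irreducible character in chi."""
    return [inner(chi, irr, S3) for irr in irreducibles]

m_reg = multiplicities(regular)
m_perm = multiplicities(fixed_points)
norm_reg = inner(regular, regular, S3)
norm_perm = inner(fixed_points, fixed_points, S3)

print(m_reg, norm_reg == sum(m * m for m in m_reg))
print(m_perm, norm_perm == sum(m * m for m in m_perm))
```

Neither character has norm 1, so neither representation is irreducible, while the two-dimensional character alone has norm 1.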


Let $\rho$ be an irreducible representation of a finite group G on a vector space V over an
algebraically closed field F whose characteristic does not divide | G |. Then V is a
simple F(G)-module, and we have a homomorphism $\rho$ from the ring F(G) to $End_F(V)$
defined by $\rho(\sum_{g \in G} a_g g) = \sum_{g \in G} a_g\rho(g)$. If $\sum_{g \in G} a_g g$ is in the center of F(G),
then $\rho(\sum_{g \in G} a_g g)$ commutes with $\rho(g)$ for all $g \in G$, and so it belongs to
$End_{F(G)}(V)$. Since V is a simple F(G)-module, members of $End_{F(G)}(V)$ are
multiplications by scalars. Let $\{C_1, C_2, \ldots, C_r\}$ be the set of distinct conjugacy classes
of G. Let $u_i = \sum_{g \in C_i} g$. Then as observed, $\{u_1, u_2, \ldots, u_r\}$ forms a basis for the
center of F(G). From the previous observation, $\rho(u_i)$ is multiplication by a scalar $a_i$ (say).


Proposition 9.3.11 Let G be a finite group, F an algebraically closed field of
characteristic 0, and $\rho$ an irreducible representation of G over F. Then the scalars $a_i$
described in the above paragraph are algebraic integers.


Proof Let $u_i = \sum_{g \in C_i} g$, where $\{C_1, C_2, \ldots, C_r\}$ is the set of all distinct conjugacy
classes of G. Let $v \in C_k$, and let $a_{ij}^k$ denote the cardinality of the set $X_{ij}^v = \{(g, h) \in
C_i \times C_j \mid gh = v\}$. If $w \in C_k$, then there is $x \in G$ such that $w = xvx^{-1}$. The
map $(g, h) \mapsto (xgx^{-1}, xhx^{-1})$ is clearly a bijective map from $X_{ij}^v$ to $X_{ij}^w$. Thus, the
integer $a_{ij}^k$ depends only on i, j, k, and not on the choice of $v \in C_k$. This also shows
that


$$u_iu_j = \sum_{k=1}^{r} a_{ij}^k u_k.$$

Thus,

$$\rho(u_i)\rho(u_j) = \rho(u_iu_j) = \sum_{k=1}^{r} a_{ij}^k \rho(u_k).$$


The left-hand side is multiplication by $a_ia_j$, and the right-hand side is multiplication by
$\sum_{k=1}^{r} a_{ij}^k a_k$. This shows that


$$a_ia_j = \sum_{k=1}^{r} a_{ij}^k a_k$$


for all i and j. We can take $C_1$ to be the conjugacy class {e}, and so $u_1 = e$. Since
$\rho(u_1) = \rho(e) = I_V$, $a_1 = 1 \neq 0$. For a fixed i, the above equation shows that the
column vector

$$(a_1, a_2, \ldots, a_r)^t$$

is an eigenvector of the matrix $[b_{jk}]$, where $b_{jk} = a_{ij}^k$, and the corresponding
eigenvalue is $a_i$. Thus, $a_i$ is a root of the monic polynomial $\det(xI - [b_{jk}])$ whose
coefficients are all integers (note that the $b_{jk} = a_{ij}^k$ are all nonnegative integers). This
shows that each $a_i$ is an algebraic integer. ♯
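
The structure constants $a_{ij}^k$ and the relation $a_ia_j = \sum_k a_{ij}^k a_k$ can be computed directly for a small group. The sketch below is an illustration only, with hypothetical helper names. It takes $G = S_3$ and the two-dimensional irreducible character, and uses the formula $a_i = |C_i|\chi_\rho(g_i)/n$, which follows from comparing traces of $\rho(u_i)$ (this is the content of Corollary 9.3.12).

```python
from fractions import Fraction
from itertools import permutations

def compose(p, q):
    """Return p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugacy_classes(group):
    """List of conjugacy classes, each a sorted list of elements."""
    classes, seen = [], set()
    for g in group:
        if g not in seen:
            cls = {compose(compose(x, g), inverse(x)) for x in group}
            classes.append(sorted(cls))
            seen |= cls
    return classes

def chi(p):
    """Two-dimensional irreducible character of S3: fixed points minus 1."""
    return sum(1 for i, image in enumerate(p) if i == image) - 1

G = list(permutations(range(3)))
classes = conjugacy_classes(G)   # order: identity, transpositions, 3-cycles
r = len(classes)

# a[i][j][k] = number of pairs (g, h) in C_i x C_j with g*h = v, for a fixed v in C_k.
a = [[[sum(1 for g in classes[i] for h in classes[j]
           if compose(g, h) == classes[k][0])
       for k in range(r)] for j in range(r)] for i in range(r)]

n = 2  # degree of the representation
scalars = [Fraction(len(c) * chi(c[0]), n) for c in classes]

relation_holds = all(
    scalars[i] * scalars[j] == sum(a[i][j][k] * scalars[k] for k in range(r))
    for i in range(r) for j in range(r))

print(scalars, relation_holds)
```

For this character the scalars are 1, 0 and −1, which are integers and hence algebraic integers.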


Corollary 9.3.12 Let G be a finite group, and F be an algebraically closed field
of characteristic 0. Let $\rho$ be an irreducible representation over F of degree n. Let
$g \in G$. Let $m = [G : C_G(g)]$ be the number of conjugates of g. Then $\frac{m\chi_\rho(g)}{n}$ is
an algebraic integer (observe that every algebraically closed field of characteristic
0 contains the field of algebraic numbers).


Proof Let $C_i$ be the conjugacy class determined by g. Then the trace of $\rho(g)$ = the
trace of $\rho(x)$ for each $x \in C_i$. Thus, $\operatorname{trace}\rho(u_i) = m \cdot \operatorname{trace}\rho(g) = m \cdot \chi_\rho(g)$. Since
$\rho(u_i)$ is multiplication by $a_i$, and the degree of $\rho$ is n, it follows that $\operatorname{trace}\rho(u_i) = n \cdot a_i$.
Thus, $a_i = \frac{m\chi_\rho(g)}{n}$. The result follows from Proposition 9.3.11. ♯


Corollary 9.3.13 The degree of every irreducible representation of a finite group
G over an algebraically closed field F of characteristic 0 divides the order of the
group.


Proof Let $\rho$ be an irreducible representation of a finite group G of degree n over an
algebraically closed field F of characteristic 0. Let $\{C_1, C_2, \ldots, C_r\}$ be the set of all
distinct conjugacy classes of G. Then $\chi_\rho$ is constant on each $C_i$. Let $\beta_i$ be the value of
$\chi_\rho$ on $C_i$. Then from the above corollary, it follows that $\frac{m_i\beta_i}{n}$ is an algebraic integer,
where $m_i$ is the number of elements in $C_i$. Let p be the permutation of $\{1, 2, \ldots, r\}$
defined by $C_i^{-1} = C_{p(i)}$, where $C_i^{-1}$ is the set of inverses of the members of $C_i$ (note
that $C_i^{-1}$ is again a conjugacy class). By the orthogonality theorem, we have


$$1 = \frac{1}{|G|} \sum_{g \in G} \chi_\rho(g)\chi_\rho(g^{-1}).$$


Thus, $|G| = \sum_{i=1}^{r} m_i\beta_i\beta_{p(i)}$. In turn,


$$\frac{|G|}{n} = \sum_{i=1}^{r} \frac{m_i\beta_i}{n}\beta_{p(i)}.$$


From the previous result, $\frac{m_i\beta_i}{n}$ is an algebraic integer. Also each $\beta_i$ is the trace of $\rho(g)$
for $g \in C_i$. Since G is finite, $g^t = e$ for some t, and so $\rho(g)^t = I_V$. This shows
that the eigenvalues of $\rho(g)$, being roots of unity, are algebraic integers. Since sums
of algebraic integers are algebraic integers, it follows that $\beta_{p(i)} = \operatorname{trace}\rho(g)$ = the
sum of the eigenvalues of $\rho(g)$, $g \in C_{p(i)}$, is an algebraic integer. Again, since sums
and products of algebraic integers are algebraic integers, it follows from the above
identity that $\frac{|G|}{n}$ is an algebraic integer. We also know that a rational number is an
algebraic integer if and only if it is an integer. Thus, $\frac{|G|}{n}$ is an integer. This means
that n divides | G |. ♯


Following is a simple application of representation theory. 


Proposition 9.3.14 Let G be a finite simple group of order n, and p be a prime
such that the number of conjugacy classes of G is greater than $\frac{n}{p^2}$. Then Sylow
p-subgroups of G are abelian.


Proof We may assume that $p^2$ divides n. Since G is simple, every nontrivial complex
representation of G is injective. Now, $n = 1 + n_2^2 + n_3^2 + \cdots + n_r^2$, where $n_2, n_3, \ldots, n_r$
are the degrees of the nontrivial irreducible representations, and r is the class number of G.
Since the class number r is greater than $\frac{n}{p^2}$, there is $i \geq 2$ such that $n_i < p$. Consider
the corresponding irreducible representation $\rho_i$. Let P be a Sylow p-subgroup of G.
Consider the restriction $\rho_i/P$ to P. Then $\rho_i/P$ is a faithful representation of P of
degree less than p. The degrees of the irreducible components of $\rho_i/P$ must divide
$|P| = p^t$, where $t \geq 2$. Since the degree of $\rho_i/P$ is less than p, it follows that
all irreducible components of $\rho_i/P$ are of degree 1. Since $\rho_i/P$ is faithful, P is
abelian. ♯


Proposition 9.3.15 Let $\chi_\rho$ be an irreducible character afforded by the irreducible
representation $\rho$ of degree n of a finite group G over an algebraically closed field F
of characteristic 0. Let g be an element of G with m conjugates such that m and n
are co-prime. Then $\frac{\chi_\rho(g)}{n}$ is an algebraic integer.


Proof By the Euclidean algorithm, there exist integers u and v such that $um + vn =
1$. Hence $\frac{\chi_\rho(g)}{n} = u \cdot \frac{m\chi_\rho(g)}{n} + v\chi_\rho(g)$. By Corollary 9.3.12, $\frac{m\chi_\rho(g)}{n}$ is an algebraic
integer. Also, since the eigenvalues of $\rho(g)$ are roots of unity (note that $\rho(g)^{|G|} = I_V$),
it follows that $\chi_\rho(g)$ is an algebraic integer. Since sums and products of algebraic
integers are algebraic integers, it follows that $\frac{\chi_\rho(g)}{n}$ is an algebraic integer. ♯


Proposition 9.3.16 Under the hypothesis of the above proposition, $\chi_\rho(g) = 0$, or
else $\rho(g)$ is multiplication by a scalar.


Proof We first show that $\rho(g)$ is multiplication by a scalar if and only if all the
eigenvalues of $\rho(g)$ are the same. One way is evident. Suppose that all the eigenvalues of
$\rho(g)$ are the same. Consider the restriction $\rho/\langle g \rangle$ of $\rho$ to the cyclic subgroup generated
by g. By Maschke's theorem, $\rho/\langle g \rangle$ is a direct sum of irreducible representations
of $\langle g \rangle$. Since irreducible representations of $\langle g \rangle$ are one dimensional, it
follows that $\rho(g)$ is diagonalizable. Since all the eigenvalues of $\rho(g)$ are the same, it is
multiplication by a scalar.

Now, suppose that $\rho(g)$ is not multiplication by a scalar. Then not all eigenvalues of
$\rho(g)$ are the same. Let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be the eigenvalues of $\rho(g)$. Then each $\lambda_i$ is a root
of unity, and so $|\lambda_i| = 1$. Since not all of them are the same, $|\chi_\rho(g)| = |\sum_{i=1}^{n} \lambda_i| < n$.
By the previous proposition, $\frac{\chi_\rho(g)}{n}$ is an algebraic integer such that $|\frac{\chi_\rho(g)}{n}| < 1$.
Let $\sigma$ be an automorphism of a finite Galois extension K of Q containing each
$\lambda_i$. Since the $\sigma(\lambda_i)$ are again roots of unity which are not all the same, $\sigma(\frac{\chi_\rho(g)}{n})$ is
also an algebraic integer whose modulus is less than 1. Let
$z = \prod_{\sigma \in Aut(K)} \sigma(\frac{\chi_\rho(g)}{n})$. Then z is an algebraic integer and $|z| < 1$. It is clear that
$\sigma(z) = z$ for all $\sigma \in Aut(K)$. Since K is a Galois extension of Q, it follows that
$z \in Q$. The only rational algebraic integers are integers, and therefore $z \in Z$. Since
$|z| < 1$, it follows that $z = 0$. This shows that $\frac{\chi_\rho(g)}{n} = 0$, and so $\chi_\rho(g) = 0$. ♯


The following is a criterion, due to Burnside, for non-simplicity of a finite group.


Theorem 9.3.17 Let G be a finite group which has a conjugacy class containing
$p^m$ elements, where p is a prime and $m \geq 1$. Then G cannot be simple.


Proof If G is abelian, then there is nothing to do. Assume that G is non-abelian.
Suppose that G is simple, and let $g \in G$ be such that there are exactly $p^m$ conjugates of
g. Let $\rho$ be a nontrivial irreducible complex representation of degree n such that p
does not divide n. Since G is simple and $\rho$ is nontrivial, it follows that $\rho$ is injective,
and so G is isomorphic to $\rho(G)$. Suppose that $\chi_\rho(g) \neq 0$. Then from the previous
result $\rho(g)$ is multiplication by a scalar. This means that $\rho(g)$ is in the center of $\rho(G)$.
Since $\rho(G)$ is simple, $\rho(g) = I_V$. Again, since $\rho$ is injective, $g = e$, the identity of
G. This is a contradiction to the supposition that g has exactly $p^m > 1$ conjugates.
Hence, $\chi_\rho(g) = 0$ whenever the degree of $\rho$ is not divisible by p. Let $\chi_{reg}$ denote
the character of the regular representation $\rho_{reg}$. Then


$$\chi_{reg} = \chi_1 + n_2\chi_2 + \cdots + n_r\chi_r,$$


where $\chi_1$ is the trivial character, and $\chi_i$ is the irreducible character of degree $n_i$. From
Proposition 9.3.16 and the previous observation, it follows that $\chi_{reg}(g) \equiv 1 \pmod{p}$.
One also observes that the matrix of $\rho_{reg}(g)$ with respect to the basis G of F(G) has
no nonzero entry in the diagonal. Hence, $\chi_{reg}(g) = 0$. This is a contradiction. ♯
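
Theorem 9.3.17 can be sanity-checked on small examples by listing conjugacy class sizes. The sketch below is an illustration with hypothetical helper names. It computes the class sizes of the simple group $A_5$, realized as the even permutations of {0, ..., 4}, and confirms that none of them is a prime power greater than 1. For contrast it does the same for the non-simple group $S_3$, where such classes do occur.

```python
from itertools import permutations

def compose(p, q):
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def sign(p):
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def class_sizes(group):
    """Sorted list of the sizes of the conjugacy classes of the group."""
    sizes, seen = [], set()
    for g in group:
        if g not in seen:
            cls = {compose(compose(x, g), inverse(x)) for x in group}
            sizes.append(len(cls))
            seen |= cls
    return sorted(sizes)

def is_prime_power(m):
    """True if m = p^k for a prime p and some k >= 1."""
    if m < 2:
        return False
    p = 2
    while m % p:          # find the smallest prime factor of m
        p += 1
    while m % p == 0:     # strip all factors of p
        m //= p
    return m == 1

A5 = [p for p in permutations(range(5)) if sign(p) == 1]
S3 = list(permutations(range(3)))

sizes_A5 = class_sizes(A5)
sizes_S3 = class_sizes(S3)

print(sizes_A5, any(is_prime_power(s) for s in sizes_A5))
print(sizes_S3, any(is_prime_power(s) for s in sizes_S3))
```

The converse of the theorem is not claimed: a group with no prime power class size need not be simple.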


Corollary 9.3.18 (Burnside) Every group of order $p^rq^s$ is solvable, where p and q
are primes.


Proof Assume the contrary. Let G be a counterexample of the smallest order. If H
is a nontrivial proper normal subgroup of G, then H and G/H are both solvable. But
this would mean that G is solvable. Hence G is simple. Suppose that $|G| = p^rq^s$.
Clearly, $r, s \geq 1$. Let Q be a Sylow q-subgroup of G. Let $g \neq e$ be a member
of the center of Q. Then $C_G(g)$ contains Q, and it is not G (for then g would be in
the center, a contradiction to the assumption of simplicity of G). This shows that
$[G : C_G(g)] = p^m$ for some $m \geq 1$. Again from the previous theorem, G cannot be
simple, a contradiction. ♯


Remark 9.3.19 The representation theoretic proof of the above result came quite
early in the twentieth century. A nonrepresentation theoretic proof of the result was
given by Thompson, Bender, and Goldschmidt quite late around 1976. Now, we have
a more general result due to Kegel and Wielandt which says that the product of any two
nilpotent groups is solvable.


Exercises 


9.3.1 Show by means of an example that nonequivalent representations over a field 
of positive characteristic may have same character. 


9.3.2 Show by means of an example that the degree of an irreducible representation 
over a field F need not divide the order of the group. 


9.3.3 Find all irreducible representations of $Q_8$ and $D_8$ over C.


9.3.4 Can we realize all complex irreducible representations of a finite group G
over Q?


9.3.5 Determine the number of irreducible complex representations of Sy, and also 
their degrees. Find them explicitly. 


9.3.6 Determine the number of irreducible complex representations of A4, and also 
find their degrees. 


9.3.7 Let F be an algebraically closed field of characteristic 0, and G be a finite
group. Let $\rho$ and $\eta$ be representations associated to F(G)-modules V and W,
respectively. Show that $\langle \chi_\rho, \chi_\eta \rangle = Dim_F(Hom_{F(G)}(V, W))$.


9.3.8 Let $G = \langle a \rangle$ be a cyclic group of prime order p. Consider the group
algebra Q(G). Let $\sigma = \frac{1}{p}\sum_{i=0}^{p-1} a^i$ and $\tau = a - \sigma$. Show that the following holds.


(i) $\sigma^2 = \sigma$.
(ii) $Q \cdot \sigma$ is a subfield isomorphic to Q.
(iii) $(1 - \sigma)^2 = (1 - \sigma)$.
(iv) $Q \cdot (1 - \sigma)$ is also a subfield isomorphic to Q.
(v) $\tau$ is a root of the irreducible polynomial $X^{p-1} + X^{p-2} + \cdots + X + 1$ over
$Q \cdot (1 - \sigma)$.
(vi) The subring F of Q(G) generated by $Q \cdot (1 - \sigma)$ and $\tau$ is isomorphic to
$Q(e^{2\pi i/p})$.
(vii) Every element of Q(G) can be written uniquely as a sum of an element of $Q \cdot \sigma$
and an element of F.
(viii) The product of any element of $Q \cdot \sigma$ with an element of F is 0.
(ix) Q(G) is isomorphic to $Q \times Q(e^{2\pi i/p})$.


9.3.9 Determine irreducible representations of a cyclic group of order 3 over Q. 


9.3.10 Let $G = H \times K$ be the direct product of finite groups H and K. Let
F be an algebraically closed field of characteristic 0. Let $\rho$ and $\eta$ be irreducible
representations of H and K on vector spaces V and W, respectively. Let $\rho \otimes \eta$ be the
representation of G on $V \otimes W$ defined by $(\rho \otimes \eta)(h, k)(v \otimes w) = \rho(h)(v) \otimes \eta(k)(w)$.
Suppose that $\mu$ and $\nu$ are also irreducible representations of H and K, respectively.
Show that


$$\langle \chi_{\rho \otimes \eta}, \chi_{\mu \otimes \nu} \rangle = 1,$$

if $\rho = \mu$ and $\eta = \nu$, and

$$\langle \chi_{\rho \otimes \eta}, \chi_{\mu \otimes \nu} \rangle = 0,$$


otherwise. Deduce that these are irreducible representations, and every irreducible
representation of G is obtained in this manner.


9.3.11 Show that the Grothendieck group of the group algebra C(G) is the character
ring Ch(G) of G. Find the Grothendieck groups of
$C(Z_m)$, $C(V_4)$, $C(S_3)$, $C(Q_8)$ and $C(D_8)$.


9.4 Induced Representations 


Let H be a subgroup of a group G, and $\rho$ be a representation of the group G. Then
the restriction of $\rho$ to H, denoted by $\rho_H$, is a representation of H. One may observe,
by means of an example, that the restriction of an irreducible representation need
not be an irreducible representation (the two-dimensional irreducible representation
of $S_3$ when restricted to $A_3$ is not irreducible).

Now, we describe the adjoint to the restriction. Let H be a subgroup of G, and
$\rho$ be a representation of H on W. Then W is a left F(H)-module. Since F(H)
is a sub-algebra of F(G), we see that F(G) is a bi-(F(G), F(H)) module. Hence
$V = F(G) \otimes_{F(H)} W$ is a left F(G)-module. This gives us a representation of G
which we denote by $\rho^G$, and call the induced representation of G induced by the
representation $\rho$ of the subgroup H of G. Let S be a left transversal to H in G.
Then F(G) as a right F(H)-module can be written as $\oplus\sum_{x \in S} xF(H)$. Thus, V can be
written as


$$V = \oplus\sum_{x \in S} x \otimes W.$$


Consider an element $x \otimes w$, $w \in W$, $x \in S$ in one of the direct summands of V.
Suppose that $gx = yh$, $h \in H$ and $y \in S$. Then $\rho^G(g)(x \otimes w) = g(x \otimes w) =
gx \otimes w = yh \otimes w = y \otimes hw = y \otimes w'$, where $w' = hw = \rho(h)(w)$. Clearly,
$Dim V = Dim W \cdot [G : H]$. Thus, $\deg \rho^G = \deg \rho \cdot [G : H]$. If $\{w_1, w_2, \ldots, w_r\}$
is a basis of W and $S = \{x_1, x_2, \ldots, x_s\}$, then $\{x_i \otimes w_j \mid 1 \leq i \leq s, 1 \leq j \leq r\}$ is
a basis of V. The character $\chi_{\rho^G}$ of G is denoted by $\chi_\rho^G$, and it is called the induced
character.


Proposition 9.4.1 Let H be a subgroup of a finite group G and F a field whose
characteristic does not divide | G |. Let $\rho$ be a representation of H. Then


$$\chi_\rho^G(g) = \frac{1}{|H|} \sum_{x \in G} \chi'_\rho(x^{-1}gx),$$


where $\chi'_\rho = \chi_\rho$ on H and 0 on G − H.


Proof Let $\rho$ be a representation of H on W, and $\{w_1, w_2, \ldots, w_r\}$ be a basis of W.
Let $S = \{x_1, x_2, \ldots, x_s\}$ be a left transversal to H in G. Let $V = F(G) \otimes_{F(H)} W$.
Then as observed, $\{x_i \otimes w_j\}$ forms a basis of V. Now, the basis element $x_i \otimes w_j$ will
contribute to the diagonal entries of $\rho^G(g)$ only if $gx_i = x_ih$ for some $h \in H$, and then
$\rho^G(g)(x_i \otimes w_j) = x_i \otimes \rho(h)(w_j)$. Thus, for such an $x_i$, the sum of the contributions
to the diagonal entries of $\rho^G(g)$ corresponding to the set $\{x_i \otimes w_j \mid 1 \leq j \leq r\}$ is
$\chi_\rho(x_i^{-1}gx_i)$. This shows that


$$\chi_\rho^G(g) = \sum_{i=1}^{s} \chi'_\rho(x_i^{-1}gx_i) = \frac{1}{|H|} \sum_{x \in G} \chi'_\rho(x^{-1}gx).$$


The last equality holds because $\chi_\rho$ is a class function on H, so that
$\chi'_\rho((xh)^{-1}g(xh)) = \chi'_\rho(x^{-1}gx)$ for all $h \in H$. ♯
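
The formula of Proposition 9.4.1 translates directly into a computation. The sketch below is an illustration with hypothetical helper names. It induces the two one-dimensional characters of the subgroup $H = \{e, (0\ 1)\}$ of $S_3$ up to $S_3$ and evaluates the result on one representative of each conjugacy class. Permutations are written as tuples of images of 0, 1, 2.

```python
from fractions import Fraction
from itertools import permutations

def compose(p, q):
    """Return p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def induced_character(chi_H, H, G):
    """Return the function g -> (1/|H|) * sum over x in G of chi'(x^{-1} g x)."""
    members = set(H)
    def chi_G(g):
        total = 0
        for x in G:
            conjugate = compose(compose(inverse(x), g), x)
            if conjugate in members:      # chi' vanishes outside H
                total += chi_H(conjugate)
        return Fraction(total, len(H))
    return chi_G

G = list(permutations(range(3)))
H = [(0, 1, 2), (1, 0, 2)]                # {e, (0 1)}

def trivial_H(h):
    return 1

def sign_H(h):
    return 1 if h == (0, 1, 2) else -1

# Representatives of the classes: identity, a transposition, a 3-cycle.
representatives = [(0, 1, 2), (1, 0, 2), (1, 2, 0)]
ind_trivial = [induced_character(trivial_H, H, G)(g) for g in representatives]
ind_sign = [induced_character(sign_H, H, G)(g) for g in representatives]

print(ind_trivial)
print(ind_sign)
```

Both induced characters have degree $3 = \deg\rho \cdot [G : H]$, as the text predicts.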


Theorem 9.4.2 (Frobenius reciprocity law) Let H be a subgroup of a finite group
G. Let $\rho$ be a representation of H, and $\eta$ be a representation of G over a field whose
characteristic does not divide | G |. Then


$$\langle \chi_\rho^G, \chi_\eta \rangle_G = \langle \chi_\rho, \chi_{\eta_H} \rangle_H,$$


where $\eta_H$ denotes the restriction of $\eta$ to H, $\langle\ ,\ \rangle_G$ denotes the inner product in F(G),
and $\langle\ ,\ \rangle_H$ the inner product in F(H).


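Proof We have


$$\langle \chi_\rho^G, \chi_\eta \rangle_G = \frac{1}{|G|} \sum_{g \in G} \chi_\rho^G(g)\chi_\eta(g^{-1})$$
$$= \frac{1}{|G|} \cdot \frac{1}{|H|} \sum_{g \in G} \sum_{x \in G} \chi'_\rho(x^{-1}gx)\chi_\eta(x^{-1}g^{-1}x) = \frac{1}{|H|} \sum_{y \in G} \chi'_\rho(y)\chi_\eta(y^{-1})$$
$$= \frac{1}{|H|} \sum_{h \in H} \chi_\rho(h)\chi_{\eta_H}(h^{-1}) = \langle \chi_\rho, \chi_{\eta_H} \rangle_H.$$


♯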
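
The reciprocity law can be verified numerically in a small case. The following sketch is an illustration with hypothetical helper names. It takes $G = S_3$ and $H = \{e, (0\ 1)\}$, and compares $\langle \chi_\rho^G, \chi_\eta \rangle_G$ with $\langle \chi_\rho, \chi_{\eta_H} \rangle_H$ for both one-dimensional characters of H and all three irreducible characters of G. Restriction to H is modelled simply by evaluating a character of G on elements of H.

```python
from fractions import Fraction
from itertools import permutations

def compose(p, q):
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def sign(p):
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def trivial(p):
    return 1

def standard(p):
    """Two-dimensional irreducible character of S3."""
    return sum(1 for i, image in enumerate(p) if i == image) - 1

def inner(alpha, beta, group):
    """<alpha, beta> = (1/|group|) * sum of alpha(g) * beta(g^{-1})."""
    return Fraction(sum(alpha(g) * beta(inverse(g)) for g in group), len(group))

def induced(chi_H, H, G):
    """Induced character via the formula of Proposition 9.4.1."""
    members = set(H)
    def chi_G(g):
        conjugates = (compose(compose(inverse(x), g), x) for x in G)
        return Fraction(sum(chi_H(c) for c in conjugates if c in members), len(H))
    return chi_G

G = list(permutations(range(3)))
H = [(0, 1, 2), (1, 0, 2)]

characters_H = [trivial, sign]             # the two characters of H
irreducibles_G = [trivial, sign, standard]

left_sides = [inner(induced(chi, H, G), eta, G)
              for chi in characters_H for eta in irreducibles_G]
right_sides = [inner(chi, eta, H)
               for chi in characters_H for eta in irreducibles_G]

print(left_sides == right_sides)
```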


In practice, to determine irreducible representations of a group G, we look at 
the representations of some special type of subgroups, induce it to G, and then 
decompose it into irreducible representations. 


Remark 9.4.3 Observe that the Frobenius reciprocity holds even if we replace
characters by class functions.


Example 9.4.4 Let H be a subgroup of a finite group G. Let S be a left transversal
to H in G. Let $1_H$ denote the trivial representation of H over a field F whose
characteristic does not divide | G |, and $V = \oplus\sum_{x \in S} x \otimes F$ the vector space
over F with $\hat{S} = \{x \otimes 1 \mid x \in S\}$ as a basis. Then the induced representation $1_H^G$
is the representation of G on V given by $1_H^G(g)(x \otimes 1) = gx \otimes 1 = yk \otimes 1 =
y \otimes 1_H(k)(1) = y \otimes 1$, where $gx = yk$, $y \in S$, $k \in H$. The character $\chi_{1_H^G}$ is given
by $\chi_{1_H^G}(g) = \operatorname{trace} 1_H^G(g) = |\{x \in S \mid gx = xk$ for some $k \in H\}| = |\{x \in S \mid
gxH = xH\}|$ for all $g \in G$. Using the Frobenius reciprocity law,


$$\langle \chi_{1_H^G}, \chi_{1_G} \rangle_G = \langle \chi_{1_H}, \chi_{1_G/H} \rangle_H = \langle \chi_{1_H}, \chi_{1_H} \rangle_H = 1.$$


It follows that the trivial representation $1_G$ of G appears once and only once in the
decomposition of $1_H^G$ as a direct sum of irreducible representations. More explicitly,
$1_H^G = 1_G \oplus s_H(G)$, where $s_H(G)$ is a representation of G with no summand equivalent to
$1_G$. We shall call $s_H(G)$ the standard representation of G induced by the subgroup
H of G. What is $s_{\{e\}}(G)$? Describe the representation $s_H(G)$. Further,


$$\frac{1}{|G|} \sum_{g \in G} \chi_{1_H^G}(g)\chi_{1_G}(g^{-1}) = 1.$$
In turn,


$$\frac{1}{|G|} \sum_{g \in G} |\{x \in S \mid gxH = xH\}| = 1.$$

Now, let $\theta$ be a left transitive action of G on X. Then $\theta$ induces a representation $\rho$ of
G on the vector space FX over F with X as a basis. If H is the isotropy subgroup
of the action at a point $x_1 \in X$, then X can be realized as a left transversal to H in
G, and the representation $\rho$ is equivalent to $1_H^G$. Thus, in this case,


$$\frac{1}{|G|} \sum_{g \in G} |\{x \in X \mid g\theta x = x\}| = 1.$$

More generally, let G be a finite group which acts on a finite set X through a left
action $\theta$. Let F be a field whose characteristic does not divide | G |, and V a vector
space over F with X as a basis. The action $\theta$ of G on X determines a representation
$\rho$ of G on V. Let $\{X_1, X_2, \ldots, X_r\}$ be the set of distinct orbits of the action. The
action of G on X induces transitive actions of G on each $X_i$. Further, $V = FX =
FX_1 \oplus FX_2 \oplus \cdots \oplus FX_r$, and $\rho$ induces representations $\rho_i$ of G on $FX_i$ for each
i with $\rho = \rho_1 \oplus \rho_2 \oplus \cdots \oplus \rho_r$. Let $H_i$ denote the isotropy subgroup of the action
at a point $x_i \in X_i$. Then as observed above, $\rho_i = 1_{H_i}^G$ for each i, and the character
$\chi_\rho = \sum_i \chi_{\rho_i}$. In turn, using the Frobenius reciprocity,


$$\langle \chi_\rho, \chi_{1_G} \rangle_G = \sum_i \langle \chi_{1_{H_i}^G}, \chi_{1_G} \rangle_G = \sum_i \langle \chi_{1_{H_i}}, \chi_{1_{H_i}} \rangle_{H_i} = r.$$

We get


$$\frac{1}{|G|} \sum_{g \in G} |\{x \in X \mid g\theta x = x\}| = r,$$

where r is the number of orbits of the action (see Exercise 9.1.11 of Algebra 1). Also
note that $1_{\{e\}}^G$ is the regular representation $\rho_{reg}$ of G.
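
The orbit counting formula just obtained can be tested on explicit permutation groups. The sketch below is an illustration with hypothetical helper names. It compares the average number of fixed points with a direct orbit count for two subgroups of $S_4$ acting on {0, 1, 2, 3}: a group of order 4 with two orbits, and a transitive cyclic group of order 4.

```python
from fractions import Fraction

def fixed_points(p):
    """Number of points fixed by the permutation p (a tuple of images)."""
    return sum(1 for i, image in enumerate(p) if i == image)

def count_orbits(group, n):
    """Count the orbits of the group on {0, ..., n-1} directly."""
    seen, orbits = set(), 0
    for x in range(n):
        if x not in seen:
            seen |= {g[x] for g in group}   # the orbit of x
            orbits += 1
    return orbits

def average_fixed_points(group):
    """(1/|G|) * sum over g of the number of fixed points of g."""
    return Fraction(sum(fixed_points(g) for g in group), len(group))

# {e, (0 1), (2 3), (0 1)(2 3)}: two orbits {0, 1} and {2, 3}.
two_orbit_group = [(0, 1, 2, 3), (1, 0, 2, 3), (0, 1, 3, 2), (1, 0, 3, 2)]
# Cyclic group generated by the 4-cycle 0 -> 1 -> 2 -> 3 -> 0: one orbit.
cyclic_group = [(0, 1, 2, 3), (1, 2, 3, 0), (2, 3, 0, 1), (3, 0, 1, 2)]

for group in (two_orbit_group, cyclic_group):
    print(average_fixed_points(group), count_orbits(group, 4))
```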


Example 9.4.5 Let G be a finite group which acts transitively on a finite set X
through a left action $\theta$. Let H denote the isotropy subgroup of the action at a point
$x \in X$. Then H also acts on X. Let $\rho$ denote the representation of G associated to
the action $\theta$. Then $\rho = 1_H^G$, and the representation of H associated to the induced
action of H on X is the restriction $1_H^G/H$. It follows from the discussion in the above
example that the number r of H-orbits of the action is given by


$$r = \langle \chi_{1_H^G/H}, \chi_{1_H} \rangle_H = \langle \chi_{1_H^G}, \chi_{1_H^G} \rangle_G = \frac{1}{|G|} \sum_{g \in G} \chi_\rho(g)\chi_\rho(g^{-1}).$$

Further, $\chi_\rho(g) = \operatorname{trace}\rho(g)$ is the number of fixed points of the action of the element
g on X, which is the same as the number of fixed points of $g^{-1}$. This shows that
$\chi_\rho(g) = \chi_\rho(g^{-1})$. Thus,
$$r = \frac{1}{|G|} \sum_{g \in G} (\chi_\rho(g))^2 = \frac{1}{|G|} \sum_{g \in G} (|\{x \in X \mid g\theta x = x\}|)^2.$$

Let us further assume that G acts doubly transitively on X. Then the isotropy subgroup
H of the action at $x_0 \in X$ acts transitively on $X - \{x_0\}$, and so the number of orbits
of the action of H on X is 2. From the above discussion, it follows that


$$\frac{1}{|G|} \sum_{g \in G} (|\{x \in X \mid g\theta x = x\}|)^2 = 2$$

(see Exercise 9.1.23, Algebra 1). Also $1_H^G = 1_G \oplus s_H(G)$, where $1_G$ does not appear
as a summand in $s_H(G)$. Hence


$$\langle \chi_{s_H(G)}, \chi_{s_H(G)} \rangle_G = 1.$$


This shows that the standard representation $s_H(G)$ of G is irreducible provided that
the action of G on X is transitive as well as doubly transitive (for example, $S_n$
or $A_n$, $n \geq 4$, acts transitively as well as doubly transitively on a set containing n
elements).
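
The criterion of Example 9.4.5 is easy to test by machine: the mean of the squared fixed-point counts equals 2 for a doubly transitive action, and then the character $\chi_\rho - \chi_{1_G}$ of the standard representation has norm 1. The sketch below is an illustration with hypothetical helper names. It checks this for $S_4$ and $A_4$ acting on four points, and shows that a transitive cyclic group of order 4, which is not doubly transitive, gives a different value.

```python
from fractions import Fraction
from itertools import permutations

def sign(p):
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def fixed_points(p):
    return sum(1 for i, image in enumerate(p) if i == image)

def mean_square_fixed_points(group):
    """(1/|G|) * sum over g of (number of fixed points of g)^2."""
    return Fraction(sum(fixed_points(g) ** 2 for g in group), len(group))

def standard_norm(group):
    """<chi - 1, chi - 1> for the permutation character chi.

    Uses chi(g) = chi(g^{-1}), so the form reduces to a mean of squares.
    """
    return Fraction(sum((fixed_points(g) - 1) ** 2 for g in group), len(group))

S4 = list(permutations(range(4)))
A4 = [p for p in S4 if sign(p) == 1]
C4 = [(0, 1, 2, 3), (1, 2, 3, 0), (2, 3, 0, 1), (3, 0, 1, 2)]  # transitive only

for name, group in (("S4", S4), ("A4", A4), ("C4", C4)):
    print(name, mean_square_fixed_points(group), standard_norm(group))
```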


Let H and K be subgroups of a group G. A subset $KgH = \{kgh \mid k \in K$ and $h \in
H\}$ is called a (K, H) double coset. The set of all (K, H) double cosets will be denoted
by [K, G, H]. What are [{e}, G, H], [H, G, {e}] and [G, G, H]? It is easily observed
that [K, G, H] is a partition of G. The set of representatives obtained by choosing
one and only one member from each (K, H)-double coset is called a double coset
representative system. For convenience, we choose e to represent the double coset KH.
Let S be a left transversal to H in G. Then $[K, G, H] = \{KxH \mid x \in S\}$. Further,
G, and so also K, acts on S in a natural manner. It follows that the number of K-orbits
of this action is precisely the number | [K, G, H] | of (K, H)-double cosets. Using
the arguments in Examples 9.4.4 and 9.4.5, we see that


$$|[K, G, H]| = \frac{1}{|K|} \sum_{k \in K} |\{x \in S \mid k\theta x = x\}|.$$


Since $k\theta x = x$ if and only if $kxH = xH$, that is, if and only if $k \in xHx^{-1} \cap K$, it
follows that


$$|[K, G, H]| = \frac{1}{|K|} \sum_{x \in S} |xHx^{-1} \cap K|.$$

We state a few results due to Brauer and Artin without proof.


Theorem 9.4.6 (Artin) Every character of G over C is a rational linear combination
of characters induced from characters of cyclic subgroups. ♯


Theorem 9.4.7 (Brauer) Every character of G is an integral linear combination of
characters induced by one-dimensional characters of subgroups of G. ♯




Exercises 


9.4.1 Show that the regular representation $\rho_{reg}$ of G is the same as the induced
representation of the trivial representation of the trivial subgroup.


9.4.2 Let H be a subgroup of G of finite index. Let W be the F-vector space with 
(G/H)! as a basis. Then the action of G on (G/H)! gives rise to a representation of 
G. Show that this representation is the representation induced by the trivial repre- 
sentation of H. When can this representation be irreducible? 


9.4.3 Describe the irreducible components of all representations of $S_3$ induced by the
representations of proper subgroups.


9.4.4 Call a group G a Frobenius group if it has a proper subgroup H such
that $H \cap xHx^{-1} = \{e\}$ for all $x \in G - H$. It is a fact, which can be proved using
induced representation theory, that $N = G - \bigcup_{g \in G} g(H - \{e\})g^{-1}$ is a normal
subgroup, called the Frobenius kernel, such that $G = HN$ and $H \cap N = \{e\}$.
Show that a finite group G is a Frobenius group if and only if it is a transitive
nonregular permutation group in which no nonidentity element has more than one
fixed point.


9.4.5 Show that $D_{4n+2}$ is a Frobenius group.


9.4.6 Let G be a Frobenius group with Frobenius kernel N and Frobenius
complement H. Show that |H| divides |N| − 1.


9.4.7 Let H be a cyclic group of order $n_H$. Define a map $\mu_H$ from H to Z by
$\mu_H(h) = n_H$ if h is a generator of H, and 0 otherwise. Show that $\mu_H$ is a class
function. Let $\nu_H = \phi(n_H)\chi_{reg} - \mu_H$, where $\phi$ is the Euler function and $\chi_{reg}$ is the
regular character of H (note that $\nu_H$ is the zero map on the trivial cyclic group). Show that
$\nu_H$ is also a class function on H.


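9.4.8 Let G be a finite group of order m. Let $\chi = \chi_{reg} - \chi_{1_G}$. Using Frobenius
reciprocity, show that for any class function $\eta$ on G,


$$\langle m\chi, \eta \rangle_G = \sum_{H \in \Omega} \langle \nu_H^G, \eta \rangle_G,$$


where $\Omega$ is the set of cyclic subgroups of G. Deduce that $m\chi = \sum_{H \in \Omega} \nu_H^G$.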


9.4.9 Let $\eta$ be a degree 1 character of a cyclic group H. Show that

$$\langle \nu_H, \eta \rangle_H = \sum_{h \in X} (1 - \eta(h)),$$


where X is the set of generators of H. Using the fact that $\eta(h)$ is an algebraic integer,
deduce that $\langle \nu_H, \eta \rangle_H$ is a non-negative integer (positive when $\eta$ is nontrivial).


9.4.10 Using the above exercises, show that $\chi$ (defined in Exercise 9.4.8) is a positive
linear combination of characters induced by degree 1 characters of cyclic subgroups
of G.




9.4.11 Let H be a subgroup of a finite group G, and $\rho$ be a representation of H on
a finite-dimensional vector space W. Let $\rho^G$ be the induced representation of G. Let
K be a subgroup of G and S a (K, H)-double coset representative system.
Let $H_x$ denote the subgroup $x^{-1}Hx \cap K$. Let $\rho_x$ denote the representation of $H_x$
defined by $\rho_x(a) = \rho(xax^{-1})$, and $\rho_x^K$ the representation of K induced by
$\rho_x$. Let $res_K(\rho^G)$ denote the restriction of $\rho^G$ to K. Show that


$$res_K(\rho^G) = \bigoplus_{x \in S} \rho_x^K.$$


9.4.12 Refer to Exercise 9.4.11 with K = H. Show that $\rho^G$ is irreducible if and
only if $\rho$ is irreducible and $\langle \chi_{\rho_x}, \chi_{res_{H_x}(\rho)} \rangle = 0$ for each $x \in S - H$. This result is
termed the Mackey irreducibility criterion.


Chapter 10 
Group Extensions and Schur Multiplier 


Chapter 8 was devoted to field extensions and Galois theory. This chapter centers
around the study of group extensions and the Schur multiplier. The guiding problem
in group theory is to classify groups up to isomorphism. The solution, in general, is
beyond the reach of mathematicians. However, mathematicians keep returning to
this problem. Let us restrict ourselves to the problem of classifying finite groups up
to isomorphism. Every finite group has a composition series, and the composition
length is an invariant of the group. If


$$G = G_1 \rhd G_2 \rhd \cdots \rhd G_n \rhd G_{n+1} = \{e\}$$


is a composition series of G, then $G_i/G_{i+1}$ is a finite simple group for each i. As
such, the problem of classifying finite groups reduces to the following two problems:

1. Classify all finite simple groups. 

2. Given a finite group H and a finite simple group K, to classify all groups G 
(up-to isomorphism) having H as a normal subgroup such that G/H is isomorphic 
to K. 

Finite simple groups have been classified. They are of the following four types: 


(i) Prime Cyclic groups. 

(ii) The alternating groups $A_n$, $n \ge 5$.
(iii) Finite simple groups of Lie types such as PSL(n, q). 
(iv) 26 Sporadic simple groups. 


The reader is referred to the book “Finite Simple Groups: An Introduction to Their
Classification” by D. Gorenstein for their detailed description.

The solution to problem 2 is still beyond the reach of mathematicians, and it is
addressed in the theory of extensions of groups and the cohomology theory of groups.
In this chapter, for convenience, we may frequently use the language of category
theory. The reader may refer to the appendix of Algebra 1 for the purpose.


© Springer Nature Singapore Pte Ltd. 2017
R. Lal, Algebra 2, Infosys Science Foundation Series in Mathematical Sciences,
DOI 10.1007/978-981-10-4256-0_10




10.1 Schreier Group Extensions 


In this section, we shall describe the Schreier theory of group extensions.
A sequence


$$\cdots \xrightarrow{\alpha_{n-2}} G_{n-1} \xrightarrow{\alpha_{n-1}} G_n \xrightarrow{\alpha_n} G_{n+1} \xrightarrow{\alpha_{n+1}} \cdots$$


of groups $G_n$ together with homomorphisms $\alpha_n$ is said to be exact at
$G_n$ if image $\alpha_{n-1}$ = ker $\alpha_n$. The sequence is said to be exact if it is exact at each
$G_n$.

A finite term exact sequence of the type


$$1 \longrightarrow H \xrightarrow{\alpha} G \xrightarrow{\beta} K \longrightarrow 1$$


with 1 representing the trivial group is called a short exact sequence. Thus, to say
that the above sequence is exact is to say that $\alpha$ is injective, $\beta$ is surjective, and
image $\alpha$ = ker $\beta$. In particular, $\alpha(H)$ is a normal subgroup of G such that $\beta$
induces an isomorphism from $G/\alpha(H)$ to K. The above short exact sequence is also
termed an extension of H by K. By abuse of language, we also say that G is
an extension of H by K.


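Example 10.1.1 For any positive integer m, we have the short exact sequence


$$\{0\} \longrightarrow m\mathbb{Z} \xrightarrow{i} \mathbb{Z} \xrightarrow{\nu} \mathbb{Z}_m \longrightarrow \{0\},$$


where i is the inclusion map, and $\nu$ is the quotient map. Thus, this is an extension of
$m\mathbb{Z}$ by $\mathbb{Z}_m$. We have another extension of $m\mathbb{Z}$ by $\mathbb{Z}_m$ given by the short exact sequence


$$\{0\} \longrightarrow m\mathbb{Z} \xrightarrow{i_1} m\mathbb{Z} \oplus \mathbb{Z}_m \xrightarrow{p_2} \mathbb{Z}_m \longrightarrow \{0\},$$


where $i_1$ is the inclusion in the first component, and $p_2$ is the second projection. Note
that $\mathbb{Z}$ is not isomorphic to $m\mathbb{Z} \oplus \mathbb{Z}_m$.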


Example 10.1.2 We have the exact sequence


$$1 \longrightarrow A_3 \xrightarrow{i} S_3 \xrightarrow{\chi} \{1, -1\} \longrightarrow 1,$$


where i is the inclusion map, and $\chi$ is the alternating map. Note that $A_3$ is a cyclic
group of order 3, and {1, −1} is a cyclic group of order 2. We have another extension
of a cyclic group of order 3 by a cyclic group of order 2 given by the exact sequence


$$\{0\} \longrightarrow \mathbb{Z}_3 \xrightarrow{i} \mathbb{Z}_6 \xrightarrow{\nu} \mathbb{Z}_2 \longrightarrow \{0\},$$


where $\mathbb{Z}_3$ is included as a cyclic group of order 3 in $\mathbb{Z}_6$, and $\nu$ is the corresponding
quotient map. Note that $S_3$ is not isomorphic to $\mathbb{Z}_6$.
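
The second sequence of the example can be verified mechanically. The sketch below assumes the concrete maps $i(a) = 2a \bmod 6$ and $\nu(b) = b \bmod 2$ (one natural choice, not spelled out in the text), and compares element orders to confirm that $S_3$ and $\mathbb{Z}_6$ are not isomorphic.

```python
from itertools import permutations

Z3, Z6, Z2 = range(3), range(6), range(2)

def i(a):
    # assumed inclusion of Z_3 as the subgroup {0, 2, 4} of Z_6
    return (2 * a) % 6

def nu(b):
    # assumed quotient map Z_6 -> Z_6/{0, 2, 4}, identified with Z_2
    return b % 2

def is_short_exact():
    hom_i = all(i((a + b) % 3) == (i(a) + i(b)) % 6 for a in Z3 for b in Z3)
    hom_nu = all(nu((a + b) % 6) == (nu(a) + nu(b)) % 2 for a in Z6 for b in Z6)
    injective = len({i(a) for a in Z3}) == len(Z3)
    surjective = {nu(b) for b in Z6} == set(Z2)
    image = {i(a) for a in Z3}
    kernel = {b for b in Z6 if nu(b) == 0}
    return hom_i and hom_nu and injective and surjective and image == kernel

def order_in_Z6(b):
    k, x = 1, b % 6
    while x != 0:
        x = (x + b) % 6
        k += 1
    return k

def order_in_S3(p):
    identity = (0, 1, 2)
    k, q = 1, p
    while q != identity:
        q = tuple(p[q[j]] for j in range(3))
        k += 1
    return k

orders_Z6 = sorted(order_in_Z6(b) for b in Z6)
orders_S3 = sorted(order_in_S3(p) for p in permutations(range(3)))
```

Since $\mathbb{Z}_6$ has elements of order 6 and $S_3$ has none, the two middle groups differ even though both are extensions of a cyclic group of order 3 by a cyclic group of order 2.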




Example 10.1.3 Let G be a group, $\eta$ the natural map from G to Aut(G) given by
$\eta(g) = f_g$, the inner automorphism determined by g ($f_g(x) = gxg^{-1}$), and $\nu$ the
natural quotient map from Aut(G) to Out(G). Then the sequence


$$\{e\} \longrightarrow Z(G) \xrightarrow{i} G \xrightarrow{\eta} Aut(G) \xrightarrow{\nu} Out(G) \longrightarrow 1$$

is an exact sequence.
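
Exactness at G says that the kernel of $g \mapsto f_g$ is the center Z(G). The following sketch checks this on two small permutation groups; the encoding of the dihedral group of order 8 as the symmetries of a 4-cycle, and all helper names, are assumptions made for illustration.

```python
from itertools import permutations

def compose(p, q):
    # (p * q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

def inner_automorphism(g, G):
    # f_g recorded as the tuple of pairs (x, g x g^{-1}), in the fixed order of G
    g_inv = inverse(g)
    return tuple((x, compose(compose(g, x), g_inv)) for x in G)

def kernel_of_eta(G):
    identity_map = tuple((x, x) for x in G)
    return [g for g in G if inner_automorphism(g, G) == identity_map]

def center(G):
    return [g for g in G if all(compose(g, x) == compose(x, g) for x in G)]

S3 = list(permutations(range(3)))
# symmetries of the square with vertices 0-1-2-3: permutations preserving adjacency
D4 = [g for g in permutations(range(4))
      if all((g[i] - g[(i + 1) % 4]) % 4 in (1, 3) for i in range(4))]
```

For `S3` the kernel is trivial, so all six inner automorphisms are distinct; for `D4` the kernel consists of the identity and the half-turn, leaving four distinct inner automorphisms.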
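Example 10.1.4 The sequences


$$\{0\} \longrightarrow \mathbb{Z} \xrightarrow{(i_1, 0)} \mathbb{Z} \oplus \mathbb{Z} \xrightarrow{p_2} \mathbb{Z} \longrightarrow \{0\}$$


and


$$\{0\} \longrightarrow \mathbb{Z} \xrightarrow{(i_1, -i_2)} \mathbb{Z} \oplus \mathbb{Z} \xrightarrow{p_1 + p_2} \mathbb{Z} \longrightarrow \{0\},$$


where $(i_1, 0)$ is the first inclusion given by $n \mapsto (n, 0)$, $p_2$ is the second projection,
$(i_1, -i_2)$ is the map given by $n \mapsto (n, -n)$, and $p_1 + p_2$ is the map given by $(n, m) \mapsto
n + m$, are short exact sequences. As such, both are extensions of $\mathbb{Z}$ by $\mathbb{Z}$. Note that
the middle term is also the same.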


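Let EXT denote the category (see the appendix of Algebra 1 for the notions in category
theory) whose objects are short exact sequences


$$1 \longrightarrow H \xrightarrow{\alpha} G \xrightarrow{\beta} K \longrightarrow 1$$


of groups, and a morphism between two extensions $E_1$ and $E_2$ given by the short
exact sequences


$$E_1 \equiv 1 \longrightarrow H_1 \xrightarrow{\alpha_1} G_1 \xrightarrow{\beta_1} K_1 \longrightarrow 1$$


and


$$E_2 \equiv 1 \longrightarrow H_2 \xrightarrow{\alpha_2} G_2 \xrightarrow{\beta_2} K_2 \longrightarrow 1$$


is a triple $(\lambda, \mu, \nu)$, where $\lambda$ is a homomorphism from $H_1$ to $H_2$, $\mu$ is a homomorphism
from $G_1$ to $G_2$, and $\nu$ is a homomorphism from $K_1$ to $K_2$ such that the following
diagram is commutative:


$$\begin{array}{ccccccccc}
1 & \longrightarrow & H_1 & \xrightarrow{\alpha_1} & G_1 & \xrightarrow{\beta_1} & K_1 & \longrightarrow & 1 \\
 & & \downarrow \lambda & & \downarrow \mu & & \downarrow \nu & & \\
1 & \longrightarrow & H_2 & \xrightarrow{\alpha_2} & G_2 & \xrightarrow{\beta_2} & K_2 & \longrightarrow & 1
\end{array}$$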




The category EXT is called the category of Schreier extensions of groups. The 
isomorphisms in this category are called the equivalences of extensions. 


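Theorem 10.1.5 (Five Lemma) Consider the following commutative diagram


$$\begin{array}{ccccccccc}
G_1 & \xrightarrow{\alpha_1} & G_2 & \xrightarrow{\alpha_2} & G_3 & \xrightarrow{\alpha_3} & G_4 & \xrightarrow{\alpha_4} & G_5 \\
\downarrow f_1 & & \downarrow f_2 & & \downarrow f_3 & & \downarrow f_4 & & \downarrow f_5 \\
H_1 & \xrightarrow{\beta_1} & H_2 & \xrightarrow{\beta_2} & H_3 & \xrightarrow{\beta_3} & H_4 & \xrightarrow{\beta_4} & H_5
\end{array}$$


where rows are exact sequences of groups, and the vertical maps are homomorphisms.
(i) If $f_1$ is surjective and $f_2$, $f_4$ are injective, then $f_3$ is injective.
(ii) If $f_5$ is injective and $f_2$, $f_4$ are surjective, then $f_3$ is surjective.
(iii) If $f_1$, $f_2$, $f_4$, $f_5$ are isomorphisms, then $f_3$ is also an isomorphism.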


Proof (i). The proof imitates the proof of the five lemma (Theorem 7.2.3)
for modules. However, we repeat the arguments. Suppose that $f_1$ is surjective, and
$f_2$ and $f_4$ are injective. We have to show that $f_3$ is injective. Let $g_3 \in G_3$ be such that
$f_3(g_3) = e$ (e will denote the identity of all the groups under consideration). Then
$e = \beta_3(e) = \beta_3(f_3(g_3)) = f_4(\alpha_3(g_3))$ (from the commutativity of the diagram).
Since $f_4$ is injective, $\alpha_3(g_3) = e$. Thus, $g_3 \in \ker \alpha_3 = \text{image}\, \alpha_2$ (exactness), and
hence there is an element $g_2 \in G_2$ such that $\alpha_2(g_2) = g_3$. Further, $e = f_3(g_3) =
f_3(\alpha_2(g_2)) = \beta_2(f_2(g_2))$ (commutativity of the diagram). Thus, $f_2(g_2) \in \ker \beta_2 =
\text{image}\, \beta_1$ (exactness). Hence, there exists $h_1 \in H_1$ such that $\beta_1(h_1) = f_2(g_2)$. Since $f_1$
is surjective, there is an element $g_1 \in G_1$ such that $f_1(g_1) = h_1$. Now, $f_2(\alpha_1(g_1)) =
\beta_1(f_1(g_1)) = \beta_1(h_1) = f_2(g_2)$. Since $f_2$ is injective, $\alpha_1(g_1) = g_2$. But, already
$\alpha_2(g_2) = g_3$. Hence $g_3 = \alpha_2(\alpha_1(g_1)) = e$ (for image $\alpha_1$ = ker $\alpha_2$). This shows
that $f_3$ is injective.

(ii). Suppose that $f_5$ is injective, and $f_2$ and $f_4$ are surjective. Let $h_3 \in H_3$. We have
to show the existence of a $g_3 \in G_3$ such that $f_3(g_3) = h_3$. Now, $\beta_3(h_3) \in H_4$.
Since $f_4$ is surjective, there is an element $g_4 \in G_4$ such that $f_4(g_4) = \beta_3(h_3)$. Now,
$f_5(\alpha_4(g_4)) = \beta_4(f_4(g_4))$ (commutativity of the diagram) $= \beta_4(\beta_3(h_3)) = e$ (exactness).
Since $f_5$ is injective, $\alpha_4(g_4) = e$. Thus, $g_4 \in \ker \alpha_4 = \text{image}\, \alpha_3$. Hence there
is an element $g_3 \in G_3$ such that $\alpha_3(g_3) = g_4$. Since $\beta_3(f_3(g_3)) = f_4(\alpha_3(g_3)) =
f_4(g_4) = \beta_3(h_3)$, we get $\beta_3(h_3(f_3(g_3))^{-1}) = e$. Thus, $h_3(f_3(g_3))^{-1} \in \ker \beta_3 = \text{image}\, \beta_2$.
Hence there exists $h_2 \in H_2$ such that $\beta_2(h_2) = h_3(f_3(g_3))^{-1}$. Since $f_2$ is surjective,
there is an element $g_2 \in G_2$ such that $f_2(g_2) = h_2$. Now $h_3(f_3(g_3))^{-1} = \beta_2(h_2) =
\beta_2(f_2(g_2)) = f_3(\alpha_2(g_2))$. This shows that $f_3(\alpha_2(g_2)g_3) = h_3$, and so $f_3$ is surjective.

(iii). Follows from (i) and (ii). □


Corollary 10.1.6 Let the triple $(\lambda, \mu, \nu)$ be a morphism between two extensions $E_1$
and $E_2$. Suppose that $\lambda$ and $\nu$ are isomorphisms. Then $\mu$ is also an isomorphism. □




Corollary 10.1.7 The triple $(\lambda, \mu, \nu)$ is an isomorphism in the category EXT if and
only if $\lambda$ and $\nu$ are isomorphisms between the corresponding groups. □


Now, we shall give another description of an equivalence class in this category.
Let


$$1 \longrightarrow H \xrightarrow{\alpha} G \xrightarrow{\beta} K \longrightarrow 1 \qquad (10.1)$$


be an extension of H by K. Since $\beta$ is surjective, there is a map t (not necessarily
a homomorphism) from K to G with t(e) = e (called a section or a transversal)
such that $\beta \circ t = I_K$ (note that we are using the axiom of choice). $\alpha(H) = \ker \beta$ is
a normal subgroup of G. Thus, for each $x \in K$ and $h \in H$, $t(x)\alpha(h)t(x)^{-1}$ belongs
to $\alpha(H)$. Since $\alpha$ is injective, there is a unique element $\sigma^t_x(h)$ in H depending on x
and h such that

$$t(x)\alpha(h)t(x)^{-1} = \alpha(\sigma^t_x(h)). \qquad (10.2)$$


This gives us a map $\sigma^t_x$ from H to H given by (10.2). Suppose that $\sigma^t_x(h_1) = \sigma^t_x(h_2)$.
Then $t(x)\alpha(h_1)t(x)^{-1} = \alpha(\sigma^t_x(h_1)) = \alpha(\sigma^t_x(h_2)) = t(x)\alpha(h_2)t(x)^{-1}$. Hence
$\alpha(h_1) = \alpha(h_2)$. Since $\alpha$ is injective, $h_1 = h_2$. This shows that $\sigma^t_x$ is an injective
map from H to H. Next, let $h \in H$. Then there is an element $a \in H$ such
that $t(x)^{-1}\alpha(h)t(x) = \alpha(a)$. Now, $t(x)\alpha(a)t(x)^{-1} = \alpha(h)$. By the definition,
$\sigma^t_x(a) = h$. This shows that $\sigma^t_x$ is also a surjective map from H to H. Again,

$$\begin{aligned}
\alpha(\sigma^t_x(h_1h_2)) &= t(x)\alpha(h_1h_2)t(x)^{-1} \\
&= t(x)\alpha(h_1)\alpha(h_2)t(x)^{-1} \\
&= t(x)\alpha(h_1)t(x)^{-1}\,t(x)\alpha(h_2)t(x)^{-1} \\
&= \alpha(\sigma^t_x(h_1))\alpha(\sigma^t_x(h_2)) \\
&= \alpha(\sigma^t_x(h_1)\sigma^t_x(h_2)).
\end{aligned}$$

Since $\alpha$ is injective, $\sigma^t_x(h_1h_2) = \sigma^t_x(h_1)\sigma^t_x(h_2)$. This shows that $\sigma^t_x$ is an automorphism
of H.

Thus, given an extension


$$E \equiv 1 \longrightarrow H \xrightarrow{\alpha} G \xrightarrow{\beta} K \longrightarrow 1$$


of H by K, every section t from K to G determines a map $\sigma^t$ from K to Aut(H) given
by Eq. 10.2 (note that $\sigma^t$ depends on the chosen section t). Since t(e) = e,


$$\sigma^t_e = I_H. \qquad (10.3)$$

Further, $\beta(t(x)t(y)) = \beta(t(x))\beta(t(y)) = xy = \beta(t(xy))$. Hence $(t(x)t(y))(t(xy))^{-1}$
belongs to $\ker \beta = \text{image}\, \alpha$. Thus, there is a unique element $f^t(x, y) \in H$
depending on t, x and y such that


$$t(x)t(y) = \alpha(f^t(x, y))t(xy). \qquad (10.4)$$




Again, as t(e) = e,


$$f^t(e, y) = 1 = f^t(x, e) \qquad (10.5)$$


for all $x, y \in K$.
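For $x, y, z \in K$,


$$\begin{aligned}
(t(x)t(y))t(z) &= \alpha(f^t(x, y))t(xy)t(z) = \alpha(f^t(x, y))\alpha(f^t(xy, z))t((xy)z) \\
&= \alpha(f^t(x, y)f^t(xy, z))t((xy)z).
\end{aligned}$$


On the other hand,


$$\begin{aligned}
t(x)(t(y)t(z)) &= t(x)\alpha(f^t(y, z))t(yz) = t(x)\alpha(f^t(y, z))t(x)^{-1}t(x)t(yz) \\
&= \alpha(\sigma^t_x(f^t(y, z)))\alpha(f^t(x, yz))t(x(yz)) = \alpha(\sigma^t_x(f^t(y, z))f^t(x, yz))t(x(yz)).
\end{aligned}$$


Equating both the expressions for t(x)t(y)t(z), we find that
$$\alpha(f^t(x, y)f^t(xy, z)) = \alpha(\sigma^t_x(f^t(y, z))f^t(x, yz)).$$
Since $\alpha$ is injective,
$$f^t(x, y)f^t(xy, z) = \sigma^t_x(f^t(y, z))f^t(x, yz). \qquad (10.6)$$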
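Next, for $x, y \in K$ and $h \in H$,
$$\begin{aligned}
(t(x)t(y))\alpha(h) &= \alpha(f^t(x, y))t(xy)\alpha(h) \\
&= \alpha(f^t(x, y))t(xy)\alpha(h)t(xy)^{-1}t(xy) = \alpha(f^t(x, y))\alpha(\sigma^t_{xy}(h))t(xy) \\
&= \alpha(f^t(x, y)\sigma^t_{xy}(h))t(xy).
\end{aligned}$$
On the other hand,
$$\begin{aligned}
t(x)(t(y)\alpha(h)) &= t(x)(t(y)\alpha(h)t(y)^{-1}t(y)) = t(x)(\alpha(\sigma^t_y(h))t(y)) \\
&= t(x)\alpha(\sigma^t_y(h))t(x)^{-1}t(x)t(y) = \alpha(\sigma^t_x(\sigma^t_y(h)))\alpha(f^t(x, y))t(xy) \\
&= \alpha(\sigma^t_x(\sigma^t_y(h))f^t(x, y))t(xy).
\end{aligned}$$


Equating both the expressions for $t(x)t(y)\alpha(h)$, and using the injectivity of $\alpha$, we find
that


$$f^t(x, y)\sigma^t_{xy}(h) = \sigma^t_x(\sigma^t_y(h))f^t(x, y). \qquad (10.7)$$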


We are prompted to have the following definition. 




Definition 10.1.8 A quadruple $(K, H, \sigma, f)$, where K and H are groups, $\sigma$ a map
from K to Aut(H), and f a map from K × K to H, is called a factor system if the
following conditions hold:

(i) $\sigma_e = I_H$ (for convenience, we denote the image of x under the map $\sigma$ by $\sigma_x$).
(ii) $f(x, e) = 1 = f(e, y)$ for all $x, y \in K$, where 1 denotes the identity of H, and
e denotes the identity of K.

(iii) $f(x, y)f(xy, z) = \sigma_x(f(y, z))f(x, yz)$ for all $x, y, z \in K$.

(iv) $f(x, y)\sigma_{xy}(h) = \sigma_x(\sigma_y(h))f(x, y)$ for all $x, y \in K$ and $h \in H$.


Remark 10.1.9 The condition (iii) can be viewed as the non-abelian version of a
2-cocycle.


The proof of the following proposition follows from the discussions which moti- 
vated the Definition 10.1.8. 


Proposition 10.1.10 Every extension E of H by K with a choice of a section t
determines a factor system $Fac(E, t) = (K, H, \sigma^t, f^t)$, where $\sigma^t$ and $f^t$ are described
by the Eqs. (10.2) and (10.4) above. This factor system is termed a factor system
associated to the extension E. □
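
To see Eqs. (10.2) and (10.4) in action, the following sketch computes a factor system for one concrete extension and checks conditions (i) to (iv) of Definition 10.1.8 by brute force. The choices are assumptions made for illustration: G is $S_4$, H is the Klein four-group (with $\alpha$ the inclusion), K is the set of cosets of H, and the section t deliberately picks a non-homomorphic set of representatives.

```python
from itertools import permutations

def mul(p, q):
    # (p * q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(len(q)))

def inv(p):
    r = [0] * len(p)
    for i, pi in enumerate(p):
        r[pi] = i
    return tuple(r)

G = list(permutations(range(4)))
e = (0, 1, 2, 3)
H = [e, (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)]   # Klein four-group, normal in S_4

def coset(g):
    # beta: G -> K = G/H, with a coset stored as a frozenset
    return frozenset(mul(g, h) for h in H)

K = list({coset(g) for g in G})
eK = coset(e)

def t(x):
    # a section with t(e) = e; elsewhere the largest tuple in the coset (arbitrary choice)
    return e if e in x else max(x)

def kmul(x, y):
    return coset(mul(t(x), t(y)))

def sigma(x, h):
    # Eq. (10.2) with alpha the inclusion: sigma_x(h) = t(x) h t(x)^{-1}
    return mul(mul(t(x), h), inv(t(x)))

def f(x, y):
    # Eq. (10.4): t(x) t(y) = f(x, y) t(xy)
    return mul(mul(t(x), t(y)), inv(t(kmul(x, y))))

def is_factor_system():
    c1 = all(sigma(eK, h) == h for h in H)
    c2 = all(f(x, eK) == e == f(eK, x) for x in K)
    c3 = all(mul(f(x, y), f(kmul(x, y), z)) == mul(sigma(x, f(y, z)), f(x, kmul(y, z)))
             for x in K for y in K for z in K)
    c4 = all(mul(f(x, y), sigma(kmul(x, y), h)) == mul(sigma(x, sigma(y, h)), f(x, y))
             for x in K for y in K for h in H)
    return c1 and c2 and c3 and c4
```

The conditions hold for any section; only the particular values of `sigma` and `f` change with the choice of t.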


Conversely, we have the following proposition. 


Proposition 10.1.11 Let $(K, H, \sigma, f)$ be a factor system. Then there exists an extension
E of H by K, and a section t of E such that $Fac(E, t) = (K, H, \sigma, f)$.


Proof Let G = H × K. Define a product · in G by


$$(a, x) \cdot (b, y) = (a\sigma_x(b)f(x, y), xy).$$


Using (i) and (ii) in the definition, $(1, e) \cdot (b, y) = (\sigma_e(b)f(e, y), ey) = (b, y)$,
and also $(a, x) \cdot (1, e) = (a\sigma_x(1)f(x, e), xe) = (a, x)$. This shows that (1, e) is the
identity of G. Let $(a, x) \in G$. To find the inverse of (a, x), we have to find a (b, y)
so that $(1, e) = (a, x) \cdot (b, y) = (a\sigma_x(b)f(x, y), xy)$. Obviously, then, $y = x^{-1}$,
and b should be such that $a\sigma_x(b)f(x, x^{-1}) = 1$. Since $\sigma_x$ is a bijective map on H,
$b = \sigma_x^{-1}(a^{-1}f(x, x^{-1})^{-1})$. More precisely, $(\sigma_x^{-1}(a^{-1}f(x, x^{-1})^{-1}), x^{-1})$ is the inverse
of (a, x). Finally, to ensure that G is a group, we have to establish the associativity
of ·. Now,


$$((a, x) \cdot (b, y)) \cdot (c, z) = (a\sigma_x(b)f(x, y), xy) \cdot (c, z) = (a\sigma_x(b)f(x, y)\sigma_{xy}(c)f(xy, z), (xy)z).$$


On the other hand,


$$(a, x) \cdot ((b, y) \cdot (c, z)) = (a, x) \cdot (b\sigma_y(c)f(y, z), yz) = (a\sigma_x(b\sigma_y(c)f(y, z))f(x, yz), x(yz)).$$


Since the multiplication in K is already associative, to ensure the associativity in G,
we need to show that




$$a\sigma_x(b)f(x, y)\sigma_{xy}(c)f(xy, z) = a\sigma_x(b\sigma_y(c)f(y, z))f(x, yz)$$


for all a, x, b, y, c, z. Since $\sigma_x$ is an automorphism, we need to verify that


$$f(x, y)\sigma_{xy}(c)f(xy, z) = \sigma_x(\sigma_y(c))\sigma_x(f(y, z))f(x, yz).$$


Using part (iii) of Definition 10.1.8, the RHS is transformed to


$$\sigma_x(\sigma_y(c))f(x, y)f(xy, z).$$


Hence we need to verify that


$$f(x, y)\sigma_{xy}(c) = \sigma_x(\sigma_y(c))f(x, y).$$


This is true because of part (iv) of Definition 10.1.8. Thus, G is a group
with respect to the product · defined above. The map $\alpha$ from H to G defined by
$\alpha(h) = (h, e)$, and the map $\beta$ from G to K defined by $\beta(a, x) = x$ are easily seen
to be homomorphisms. In turn, we get the extension


$$E \equiv 1 \longrightarrow H \xrightarrow{\alpha} G \xrightarrow{\beta} K \longrightarrow 1$$


of H by K. Consider the section t of E given by t(x) = (1, x). Then $t(x) \cdot t(y) =
(1, x) \cdot (1, y) = (f(x, y), xy) = (f(x, y), e) \cdot (1, xy) = \alpha(f(x, y)) \cdot t(xy)$. This
shows that $f^t(x, y) = f(x, y)$ for all $x, y \in K$. Thus, $f^t = f$. Again,
$$\begin{aligned}
t(x) \cdot \alpha(h) \cdot t(x)^{-1} &= (1, x) \cdot (h, e) \cdot (1, x)^{-1} \\
&= (\sigma_x(h)f(x, e), x) \cdot (\sigma_x^{-1}(f(x, x^{-1})^{-1}), x^{-1}) \\
&= (\sigma_x(h)f(x, x^{-1})^{-1}f(x, x^{-1}), e) = (\sigma_x(h), e) = \alpha(\sigma_x(h))
\end{aligned}$$
(note that $(1, x)^{-1} = (\sigma_x^{-1}(f(x, x^{-1})^{-1}), x^{-1})$). This shows that $\sigma^t_x(h) = \sigma_x(h)$ for
all $x \in K$ and $h \in H$. It follows that $\sigma^t = \sigma$ and $f^t = f$. Thus, Fac(E, t) is the given
factor system $(K, H, \sigma, f)$. □
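
The construction in the proof can be carried out explicitly. As a sketch under stated assumptions, take $H = \mathbb{Z}_4$ and $K = \mathbb{Z}_2$ written additively, let $\sigma_1$ be inversion on $\mathbb{Z}_4$, and let f(1, 1) = 2 with all other values 0 (this particular factor system is chosen here for illustration and does not come from the text). In additive notation the product reads $(a, x) \cdot (b, y) = (a + \sigma_x(b) + f(x, y), x + y)$.

```python
from itertools import product

H = range(4)   # Z_4, written additively
K = range(2)   # Z_2, written additively

def sigma(x, b):
    # sigma_0 is the identity, sigma_1 is inversion on Z_4
    return b % 4 if x == 0 else (-b) % 4

def f(x, y):
    # assumed factor set: f(1, 1) = 2, zero elsewhere
    return 2 if (x, y) == (1, 1) else 0

def mul(p, q):
    # additive form of (a, x).(b, y) = (a sigma_x(b) f(x, y), xy)
    (a, x), (b, y) = p, q
    return ((a + sigma(x, b) + f(x, y)) % 4, (x + y) % 2)

G = list(product(H, K))
identity = (0, 0)

def is_group():
    unit = all(mul(identity, g) == g == mul(g, identity) for g in G)
    assoc = all(mul(mul(p, q), r) == mul(p, mul(q, r)) for p in G for q in G for r in G)
    inverses = all(any(mul(g, h) == identity == mul(h, g) for h in G) for g in G)
    return unit and assoc and inverses

def order(g):
    k, x = 1, g
    while x != identity:
        x = mul(x, g)
        k += 1
    return k
```

The resulting group of order 8 is non-abelian with a single element of order 2, so it is the quaternion group; the section t(x) = (0, x) satisfies $t(1) \cdot t(1) = (2, 0) = \alpha(f(1, 1))$, recovering f as in the proof.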


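Now, we describe the category EXT of extensions as the category of factor
systems. Let $(\lambda, \mu, \nu)$ be a morphism between the extensions $E_1$ and $E_2$ given by the
following commutative diagram:


$$\begin{array}{ccccccccc}
1 & \longrightarrow & H_1 & \xrightarrow{\alpha_1} & G_1 & \xrightarrow{\beta_1} & K_1 & \longrightarrow & 1 \\
 & & \downarrow \lambda & & \downarrow \mu & & \downarrow \nu & & \\
1 & \longrightarrow & H_2 & \xrightarrow{\alpha_2} & G_2 & \xrightarrow{\beta_2} & K_2 & \longrightarrow & 1
\end{array}$$

Let $t_1$ be a section of $E_1$, and $t_2$ be a section of $E_2$. Let $(K_1, H_1, \sigma^{t_1}, f^{t_1})$ and
$(K_2, H_2, \sigma^{t_2}, f^{t_2})$ be the corresponding factor systems. Let $x \in K_1$. Then $\mu(t_1(x)) \in
G_2$ and $\beta_2(\mu(t_1(x))) = \nu(\beta_1(t_1(x))) = \nu(x) = \beta_2(t_2(\nu(x)))$. Thus,
$\mu(t_1(x))(t_2(\nu(x)))^{-1} \in \ker \beta_2 = \text{image}\, \alpha_2$. In turn, we have a unique $g(x) \in H_2$ such that
$\alpha_2(g(x)) = \mu(t_1(x))(t_2(\nu(x)))^{-1}$. Equivalently,


$$\mu(t_1(x)) = \alpha_2(g(x))t_2(\nu(x)). \qquad (10.8)$$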
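Since $t_1(e) = e = t_2(e)$, it follows that
$$g(e) = 1. \qquad (10.9)$$
Now, using the commutativity of the diagram and Eq. 10.8, we have


$$\begin{aligned}
\mu(t_1(x)t_1(y)) &= \mu(\alpha_1(f^{t_1}(x, y))t_1(xy)) = \alpha_2(\lambda(f^{t_1}(x, y)))\mu(t_1(xy)) \\
&= \alpha_2(\lambda(f^{t_1}(x, y)))\alpha_2(g(xy))t_2(\nu(xy)) = \alpha_2(\lambda(f^{t_1}(x, y)))\alpha_2(g(xy))t_2(\nu(x)\nu(y)).
\end{aligned}$$


On the other hand, since $\mu$ is a homomorphism, using again Eq. 10.8,


$$\begin{aligned}
\mu(t_1(x)t_1(y)) &= \mu(t_1(x))\mu(t_1(y)) = \alpha_2(g(x))t_2(\nu(x))\alpha_2(g(y))t_2(\nu(y)) \\
&= \alpha_2(g(x))t_2(\nu(x))\alpha_2(g(y))(t_2(\nu(x)))^{-1}t_2(\nu(x))t_2(\nu(y)) \\
&= \alpha_2(g(x))\alpha_2(\sigma^{t_2}_{\nu(x)}(g(y)))t_2(\nu(x))t_2(\nu(y)) \\
&= \alpha_2(g(x))\alpha_2(\sigma^{t_2}_{\nu(x)}(g(y)))\alpha_2(f^{t_2}(\nu(x), \nu(y)))t_2(\nu(x)\nu(y)).
\end{aligned}$$


Equating the two expressions for $\mu(t_1(x)t_1(y))$, and observing that $\alpha_2$ is an injective
homomorphism, we obtain the following identity:


$$\lambda(f^{t_1}(x, y))g(xy) = g(x)\sigma^{t_2}_{\nu(x)}(g(y))f^{t_2}(\nu(x), \nu(y)). \qquad (10.10)$$
Further, by Eq. 10.2,
$$t_1(x)\alpha_1(h)t_1(x)^{-1} = \alpha_1(\sigma^{t_1}_x(h)).$$


Applying the homomorphism $\mu$ to the above equation, and using the commutativity
of the diagram, we get


$$\mu(t_1(x))\alpha_2(\lambda(h))(\mu(t_1(x)))^{-1} = \alpha_2(\lambda(\sigma^{t_1}_x(h))).$$
Using Eq. 10.8,
$$\alpha_2(g(x))t_2(\nu(x))\alpha_2(\lambda(h))(t_2(\nu(x)))^{-1}\alpha_2(g(x)^{-1}) = \alpha_2(\lambda(\sigma^{t_1}_x(h))).$$
Using Eq. 10.2 for the extension $E_2$, we get


$$\alpha_2(g(x))\alpha_2(\sigma^{t_2}_{\nu(x)}(\lambda(h)))\alpha_2(g(x)^{-1}) = \alpha_2(\lambda(\sigma^{t_1}_x(h))).$$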



Since $\alpha_2$ is an injective homomorphism,
$$g(x)\sigma^{t_2}_{\nu(x)}(\lambda(h))g(x)^{-1} = \lambda(\sigma^{t_1}_x(h)). \qquad (10.11)$$


Thus, a morphism $(\lambda, \mu, \nu)$ between extensions $E_1$ and $E_2$, together with choices
of sections $t_1$ and $t_2$ of the corresponding extensions, induces a map g from $K_1$ to
$H_2$ such that the triple $(\nu, g, \lambda)$ satisfies (10.9), (10.10) and (10.11), and it may be
viewed as a morphism from the factor system $(K_1, H_1, \sigma^{t_1}, f^{t_1})$ to $(K_2, H_2, \sigma^{t_2}, f^{t_2})$.
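Let $(\lambda_1, \mu_1, \nu_1)$ be a morphism from an extension


$$E_1 \equiv 1 \longrightarrow H_1 \xrightarrow{\alpha_1} G_1 \xrightarrow{\beta_1} K_1 \longrightarrow 1$$


to an extension


$$E_2 \equiv 1 \longrightarrow H_2 \xrightarrow{\alpha_2} G_2 \xrightarrow{\beta_2} K_2 \longrightarrow 1,$$


and $(\lambda_2, \mu_2, \nu_2)$ be that from the extension $E_2$ to


$$E_3 \equiv 1 \longrightarrow H_3 \xrightarrow{\alpha_3} G_3 \xrightarrow{\beta_3} K_3 \longrightarrow 1.$$
Let $t_1$, $t_2$ and $t_3$ be the corresponding choices of sections. Then, as in (10.8),
$$\mu_1(t_1(x)) = \alpha_2(g_1(x))t_2(\nu_1(x))$$


and


$$\mu_2(t_2(u)) = \alpha_3(g_2(u))t_3(\nu_2(u)),$$


where $g_1$ is the uniquely determined map from $K_1$ to $H_2$, and $g_2$ is that from $K_2$ to
$H_3$. In turn,


$$\begin{aligned}
\mu_2(\mu_1(t_1(x))) &= \mu_2(\alpha_2(g_1(x)))\mu_2(t_2(\nu_1(x))) \\
&= \mu_2(\alpha_2(g_1(x)))\alpha_3(g_2(\nu_1(x)))t_3(\nu_2(\nu_1(x))) \\
&= \alpha_3(\lambda_2(g_1(x)))\alpha_3(g_2(\nu_1(x)))t_3(\nu_2(\nu_1(x))) = \alpha_3(g_3(x))t_3(\nu_2(\nu_1(x))).
\end{aligned}$$


It follows that the composition $(\lambda_2 \circ \lambda_1, \mu_2 \circ \mu_1, \nu_2 \circ \nu_1)$ induces the triple $(\nu_2 \circ
\nu_1, g_3, \lambda_2 \circ \lambda_1)$, where $g_3(x) = \lambda_2(g_1(x))g_2(\nu_1(x))$ for each $x \in K_1$.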

Prompted by the above discussion, we introduce the category FACS whose objects
are factor systems, and a morphism from $(K_1, H_1, \sigma^1, f^1)$ to $(K_2, H_2, \sigma^2, f^2)$ is a triple
$(\nu, g, \lambda)$, where $\nu$ is a homomorphism from $K_1$ to $K_2$, $\lambda$ a homomorphism from $H_1$
to $H_2$, and g a map from $K_1$ to $H_2$ such that
(i) $g(e) = 1$,
(ii) $\lambda(f^1(x, y))g(xy) = g(x)\sigma^2_{\nu(x)}(g(y))f^2(\nu(x), \nu(y))$,

(iii) $g(x)\sigma^2_{\nu(x)}(\lambda(h))g(x)^{-1} = \lambda(\sigma^1_x(h))$.
The composition of the morphism $(\nu_1, g_1, \lambda_1)$ from $(K_1, H_1, \sigma^1, f^1)$ to $(K_2, H_2, \sigma^2, f^2)$
and the morphism $(\nu_2, g_2, \lambda_2)$ from $(K_2, H_2, \sigma^2, f^2)$ to $(K_3, H_3, \sigma^3, f^3)$ is the triple
$(\nu_2 \circ \nu_1, g_3, \lambda_2 \circ \lambda_1)$, where $g_3$ is given by $g_3(x) = \lambda_2(g_1(x))g_2(\nu_1(x))$ for each
$x \in K_1$.
The following theorem is the consequence of the above discussion. 


Theorem 10.1.12 Let $t_E$ be a choice of a section of the extension E of a group by
another group (such a choice function t exists because of the axiom of choice). Then the
association Fac which associates to each extension E the factor system $Fac(E, t_E)$
is an equivalence from the category EXT of extensions to the category FACS of
factor systems. □


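Fix a pair H and K of groups. We try to describe the equivalence classes of
extensions of H by K. Let G be an extension of H by K given by the exact sequence


$$E \equiv 1 \longrightarrow H \xrightarrow{\alpha} G \xrightarrow{\beta} K \longrightarrow 1.$$


Let $(\lambda, \mu, \nu)$ be an equivalence from this extension to another extension G′ of H′
by K′ given by the exact sequence


$$E' \equiv 1 \longrightarrow H' \xrightarrow{\alpha'} G' \xrightarrow{\beta'} K' \longrightarrow 1.$$


Then it follows that G′ is also an extension of H by K given by the exact sequence


$$E'' \equiv 1 \longrightarrow H \xrightarrow{\alpha' \circ \lambda} G' \xrightarrow{\nu^{-1} \circ \beta'} K \longrightarrow 1$$


such that $(I_H, \mu, I_K)$ is an equivalence from E to E″. Also, $(\lambda, I_{G'}, \nu)$ is an equivalence
from E″ to E′.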

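As such, there is no loss of generality in restricting the concept of equivalence to
the class E(H, K) of all extensions of H by K by saying that two extensions


$$E_1 \equiv 1 \longrightarrow H \xrightarrow{\alpha_1} G_1 \xrightarrow{\beta_1} K \longrightarrow 1$$


and
$$E_2 \equiv 1 \longrightarrow H \xrightarrow{\alpha_2} G_2 \xrightarrow{\beta_2} K \longrightarrow 1$$


in E(H, K) are equivalent if there is an isomorphism $\phi$ from $G_1$ to $G_2$ such that the
diagram


$$\begin{array}{ccccccccc}
1 & \longrightarrow & H & \xrightarrow{\alpha_1} & G_1 & \xrightarrow{\beta_1} & K & \longrightarrow & 1 \\
 & & \downarrow I_H & & \downarrow \phi & & \downarrow I_K & & \\
1 & \longrightarrow & H & \xrightarrow{\alpha_2} & G_2 & \xrightarrow{\beta_2} & K & \longrightarrow & 1
\end{array}$$


is commutative. Indeed, for any extension E in EXT which is equivalent to a member
E′ of E(H, K), there is a member E″ of E(H, K) such that E is equivalent to E″ in
the category EXT and E″ in E(H, K) is equivalent to E′ in the sense described above.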
Let
$$E \equiv 1 \longrightarrow H \xrightarrow{\alpha} G \xrightarrow{\beta} K \longrightarrow 1$$


be an extension of H by K. Let t be a section of the extension. Then t induces a map
$\sigma^t$ from K to Aut(H) as described by Eq. 10.2. In turn, it induces a map $\psi^t_E$ from
K to Out(H) given by $\psi^t_E(x) = \sigma^t_x Inn(H)$. Let


$$E' \equiv 1 \longrightarrow H \xrightarrow{\alpha'} G' \xrightarrow{\beta'} K \longrightarrow 1$$


be another equivalent extension in E(H, K). Let $(I_H, \phi, I_K)$ be an equivalence from
E to E′, and t′ be a section of E′. It induces an equivalence $(I_K, g, I_H)$ from the factor
system $(K, H, \sigma^t, f^t)$ to the factor system $(K, H, \sigma^{t'}, f^{t'})$, where g is a map from K
to H. From (10.10) and (10.11), we have


$$f^t(x, y)g(xy) = g(x)\sigma^{t'}_x(g(y))f^{t'}(x, y) \qquad (10.12)$$


and
$$g(x)\sigma^{t'}_x(h)g(x)^{-1} = \sigma^t_x(h) \qquad (10.13)$$


for all $x, y \in K$ and $h \in H$. Thus, $\sigma^t_x$ and $\sigma^{t'}_x$ differ by an inner automorphism of H.
Hence $\psi^t_E(x) = \sigma^t_x Inn(H) = \sigma^{t'}_x Inn(H) = \psi^{t'}_{E'}(x)$ for each $x \in K$. This shows that
the map $\psi^t_E$ from K to Out(H) is independent of the representative E of the equivalence
class, and also independent of the choice of the section t. Thus, without any ambiguity,
we can denote $\psi^t_E$ by $\psi_{[E]}$, where [E] denotes the equivalence class determined by
the extension E. Further, from (10.7), it follows that $\sigma^t_{xy}$ and $\sigma^t_x \circ \sigma^t_y$ differ by an
inner automorphism of H. Hence $\psi_{[E]}(xy) = \psi_{[E]}(x)\psi_{[E]}(y)$ for all $x, y \in K$.


Definition 10.1.13 A homomorphism from K to Out(H) = Aut(H)/Inn(H) is
called a coupling or an abstract kernel of K to H.


We have established the following theorem: 


Theorem 10.1.14 Let Ext(H, K) denote the set of all equivalence classes of extensions
in E(H, K). Then there is a natural map $\Psi$ from Ext(H, K) to the set
Hom(K, Out(H)) of all abstract kernels (couplings) of K to H given by $\Psi([E]) =
\psi_{[E]}$ as defined above.


The map $\Psi$ described in the above theorem is called the abstract kernel map or
the coupling map.


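Example 10.1.15 The abstract kernel map $\Psi$ need not be injective. In other words,
two non-equivalent extensions of H by K may induce the same abstract kernel of K to H.
For example, consider the extensions


$$E_1 \equiv \{0\} \longrightarrow \mathbb{Z} \xrightarrow{i_1} \mathbb{Z} \oplus \mathbb{Z}_3 \xrightarrow{p_2} \mathbb{Z}_3 \longrightarrow \{0\}$$


and


$$E_2 \equiv \{0\} \longrightarrow \mathbb{Z} \xrightarrow{m_3} \mathbb{Z} \xrightarrow{\nu} \mathbb{Z}_3 \longrightarrow \{0\}$$


of the group $\mathbb{Z}$ by $\mathbb{Z}_3$, where $m_3$ is the multiplication by 3, and $\nu$ is the natural quotient
map. These two extensions $E_1$ and $E_2$ are not equivalent, as $\mathbb{Z} \oplus \mathbb{Z}_3$ and $\mathbb{Z}$ are not
isomorphic. Since Out($\mathbb{Z}$) is a group of order 2 and $\mathbb{Z}_3$ is a group of order 3, the
only abstract kernel of $\mathbb{Z}_3$ to $\mathbb{Z}$ is the trivial map. As such, $[E_1] \neq [E_2]$, whereas
$\Psi([E_1]) = \Psi([E_2])$.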
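
The claim that the only abstract kernel of $\mathbb{Z}_3$ to $\mathbb{Z}$ is trivial amounts to counting homomorphisms from $\mathbb{Z}_3$ to a group of order 2. A small sketch follows, assuming cyclic groups are written additively and a homomorphism $\mathbb{Z}_m \to \mathbb{Z}_n$ is recorded by the image of 1; the function name is hypothetical.

```python
def homomorphisms(m, n):
    """All group homomorphisms Z_m -> Z_n, each recorded by the image c of 1."""
    # x -> c*x is well defined on Z_m exactly when m*c is 0 modulo n
    return [c for c in range(n) if (m * c) % n == 0]
```

Here `homomorphisms(3, 2)` contains only the zero map, in line with the example, whereas `homomorphisms(2, 2)` has two members, so an extension of $\mathbb{Z}$ by $\mathbb{Z}_2$ may carry a nontrivial abstract kernel.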


We shall see that the map $\Psi$ may not be surjective either. Indeed, we have two basic
problems in the theory of extensions of groups.

1. To determine the abstract kernels $\eta \in Hom(K, Out(H))$ which are realizable
from an extension E of H by K in the sense that $\Psi([E]) = \eta$.

2. Given an abstract kernel $\eta \in Hom(K, Out(H))$ which is realizable from an
extension, to determine and classify all extensions E up to equivalence such that
$\Psi([E]) = \eta$. Such abstract kernels are called couplings.


Theorem 10.1.16 Let H be a group with Z(H) = {e}. Then the map $\Psi$ from
Ext(H, K) to the set Hom(K, Out(H)) is bijective. More explicitly, every abstract
kernel $\eta$ of K to H determines and is determined uniquely by an equivalence class
of extensions in Ext(H, K).


Proof Let $\eta \in Hom(K, Out(H))$ be an abstract kernel of K to H. Consider the pullback
diagram


$$\begin{array}{ccc}
G & \xrightarrow{p_2} & K \\
\downarrow p_1 & & \downarrow \eta \\
Aut(H) & \xrightarrow{\nu} & Out(H)
\end{array}$$


More explicitly, G is the subgroup of the direct product Aut(H) × K given by $G =
\{(\sigma, x) \mid \sigma \in Aut(H) \text{ and } \sigma Inn(H) = \eta(x)\}$, $p_1$ is the first projection, and $p_2$ is the second
projection. Clearly, $p_2$ is a surjective homomorphism from G to K. Further, $\ker p_2 =
\{(\sigma, e) \mid \sigma Inn(H) = \eta(e) = Inn(H)\} = Inn(H) \times \{e\}$. Since the center Z(H) of
H is trivial, the map $\alpha$ from H to G defined by $\alpha(h) = (i_h, e)$ ($i_h$ denotes the inner
automorphism determined by h) is an injective homomorphism with image $\alpha$ =
ker $p_2$. This gives an extension E of H by K given by the exact sequence


$$E \equiv 1 \longrightarrow H \xrightarrow{\alpha} G \xrightarrow{p_2} K \longrightarrow 1.$$


Using the axiom of choice, there is a map $\xi$ from K to Aut(H) such that $\xi(x)Inn(H) =
\eta(x)$. This determines a section t of the extension E given by $t(x) = (\xi(x), x)$.




Recall that the abstract kernel $\Psi(E)$ associated to the extension E is given by
$\Psi(E)(x) = \sigma^t_x Inn(H)$, where $\sigma^t_x$ is given by (see Eq. 10.2)


$$t(x)\alpha(h)t(x)^{-1} = \alpha(\sigma^t_x(h)).$$


Now,
$$\begin{aligned}
\alpha(\sigma^t_x(h)) &= t(x)\alpha(h)t(x)^{-1} = (\xi(x), x)(i_h, e)(\xi(x), x)^{-1} \\
&= (\xi(x)\, i_h\, (\xi(x))^{-1}, e) = (i_{\xi(x)(h)}, e) = \alpha(\xi(x)(h)).
\end{aligned}$$


Thus, $\sigma^t_x(h) = \xi(x)(h)$ for all $h \in H$. In turn, $\sigma^t_x = \xi(x)$. By the definition,
$\Psi(E)(x) = \sigma^t_x Inn(H) = \xi(x)Inn(H) = \eta(x)$ for all $x \in K$. This shows that
$\Psi(E) = \eta$, and so $\Psi$ is surjective.

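To prove the injectivity, suppose that $\Psi(E_1) = \Psi(E_2)$, where $E_1$ and $E_2$ are
extensions of H by K given by


$$E_1 \equiv 1 \longrightarrow H \xrightarrow{\alpha_1} G_1 \xrightarrow{\beta_1} K \longrightarrow 1$$


and
$$E_2 \equiv 1 \longrightarrow H \xrightarrow{\alpha_2} G_2 \xrightarrow{\beta_2} K \longrightarrow 1.$$


Let $t_1$ be a section of $E_1$ with the corresponding factor system $(K, H, \sigma^{t_1}, f^{t_1})$, and $t_2$
be a section of $E_2$ with the corresponding factor system $(K, H, \sigma^{t_2}, f^{t_2})$. Under our
assumption,

$$\sigma^{t_1}_x Inn(H) = \Psi(E_1)(x) = \Psi(E_2)(x) = \sigma^{t_2}_x Inn(H)$$


for all $x \in K$, where $\sigma^{t_1}_x$ and $\sigma^{t_2}_x$ are given by the equations


$$t_1(x)\alpha_1(h)t_1(x)^{-1} = \alpha_1(\sigma^{t_1}_x(h))$$


and
$$t_2(x)\alpha_2(h)t_2(x)^{-1} = \alpha_2(\sigma^{t_2}_x(h)).$$
Since $\sigma^{t_1}_x Inn(H) = \sigma^{t_2}_x Inn(H)$ for all $x \in K$, and since H is centerless, there is a
unique map g from K to H such that


$$\sigma^{t_1}_x = i_{g(x)} \circ \sigma^{t_2}_x \qquad (10.14)$$


for all $x \in K$. Again by (10.7), we have


$$f^{t_1}(x, y)\sigma^{t_1}_{xy}(h) = \sigma^{t_1}_x(\sigma^{t_1}_y(h))f^{t_1}(x, y) \qquad (10.15)$$


and


$$f^{t_2}(x, y)\sigma^{t_2}_{xy}(h) = \sigma^{t_2}_x(\sigma^{t_2}_y(h))f^{t_2}(x, y). \qquad (10.16)$$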


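Using (10.14), (10.15), and (10.16), we get


$$\begin{aligned}
f^{t_1}(x, y)\,g(xy)\,\sigma^{t_2}_{xy}(h)\,g(xy)^{-1}
&= f^{t_1}(x, y)\,\sigma^{t_1}_{xy}(h) \\
&= \sigma^{t_1}_x(\sigma^{t_1}_y(h))\,f^{t_1}(x, y) \\
&= g(x)\,\sigma^{t_2}_x(\sigma^{t_1}_y(h))\,g(x)^{-1}f^{t_1}(x, y) \\
&= g(x)\,\sigma^{t_2}_x\big(g(y)\sigma^{t_2}_y(h)g(y)^{-1}\big)\,g(x)^{-1}f^{t_1}(x, y) \\
&= g(x)\,\sigma^{t_2}_x(g(y))\,\sigma^{t_2}_x(\sigma^{t_2}_y(h))\,\sigma^{t_2}_x(g(y)^{-1})\,g(x)^{-1}f^{t_1}(x, y). \qquad (10.17)
\end{aligned}$$


In turn,

$$\begin{aligned}
&\sigma^{t_2}_x(g(y)^{-1})g(x)^{-1}f^{t_1}(x, y)g(xy)\,\sigma^{t_2}_{xy}(h)\,\big(\sigma^{t_2}_x(g(y)^{-1})g(x)^{-1}f^{t_1}(x, y)g(xy)\big)^{-1} \\
&\qquad = \sigma^{t_2}_x(\sigma^{t_2}_y(h)) \\
&\qquad = f^{t_2}(x, y)\,\sigma^{t_2}_{xy}(h)\,(f^{t_2}(x, y))^{-1}
\end{aligned}$$


for all $h \in H$. Since $\sigma^{t_2}_{xy}$ is a bijective map on H, and H is centerless,


$$f^{t_2}(x, y) = \sigma^{t_2}_x(g(y)^{-1})g(x)^{-1}f^{t_1}(x, y)g(xy),$$
or equivalently,


$$f^{t_1}(x, y)g(xy) = g(x)\sigma^{t_2}_x(g(y))f^{t_2}(x, y). \qquad (10.18)$$


The Eqs. (10.14) and (10.18) tell us that $(I_K, g, I_H)$ is an isomorphism from the
factor system $(K, H, \sigma^{t_1}, f^{t_1})$ to $(K, H, \sigma^{t_2}, f^{t_2})$ in the category FACS. From
Theorem 10.1.12, $E_1$ is equivalent to $E_2$. □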


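Indeed, the proof of the above theorem establishes the following more general
result.


Proposition 10.1.17 Let $E_1$ and $E_2$ be extensions of H by K with $\Psi(E_1) = \Psi(E_2)$.
Then the following induced extensions $E_1'$ and $E_2'$ of H/Z(H) by K given below are
equivalent:


$$E_1' \equiv 1 \longrightarrow H/Z(H) \xrightarrow{\bar{\alpha}_1} G_1/\alpha_1(Z(H)) \xrightarrow{\bar{\beta}_1} K \longrightarrow 1,$$
$$E_2' \equiv 1 \longrightarrow H/Z(H) \xrightarrow{\bar{\alpha}_2} G_2/\alpha_2(Z(H)) \xrightarrow{\bar{\beta}_2} K \longrightarrow 1,$$
where $\bar{\alpha}_i$ and $\bar{\beta}_i$ denote the maps induced by $\alpha_i$ and $\beta_i$. □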


Split Extensions, Semi-direct Products 


Definition 10.1.18 An extension


$$E \equiv 1 \longrightarrow H \xrightarrow{\alpha} G \xrightarrow{\beta} K \longrightarrow 1$$


of H by K is called a split extension if there is a section t which is a homomorphism.
Such a section t is called a splitting of the extension. The corresponding factor system
$(K, H, \sigma^t, f^t)$ is such that $f^t$ is trivial in the sense that $f^t(x, y) = 1$ for all $x, y \in K$,
and then $\sigma^t$ is a homomorphism from K to Aut(H).


Example 10.1.2 (both the extensions) and Example 10.1.4 (both the extensions)
are split extensions, whereas, for $m \ge 2$, the extension


$$\{0\} \longrightarrow m\mathbb{Z} \xrightarrow{i} \mathbb{Z} \xrightarrow{\nu} \mathbb{Z}_m \longrightarrow \{0\}$$


is not a split extension, as the only homomorphism from $\mathbb{Z}_m$ to $\mathbb{Z}$ is the zero
homomorphism.
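
The reason the only homomorphism from $\mathbb{Z}_m$ to $\mathbb{Z}$ is zero can be spelled out in one chain of implications. The following worked step is an addition for the reader's convenience; $\bar{1}$ denotes the class of 1 in $\mathbb{Z}_m$ and t a hypothetical homomorphic section.

```latex
\begin{align*}
t \colon \mathbb{Z}_m \to \mathbb{Z} \text{ a homomorphism}
  &\implies m\, t(\bar{1}) = t(m \cdot \bar{1}) = t(\bar{0}) = 0 \\
  &\implies t(\bar{1}) = 0 \quad (\mathbb{Z} \text{ is torsion free and } m \neq 0) \\
  &\implies t = 0 \\
  &\implies \nu \circ t = 0 \neq I_{\mathbb{Z}_m} \quad \text{whenever } m \geq 2 .
\end{align*}
```

Hence no homomorphism t can satisfy $\nu \circ t = I_{\mathbb{Z}_m}$, which is what a splitting would require.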

Recall that a group G is said to be the semi-direct product of a normal subgroup
H of G with a subgroup K of G if
(i) G = HK, and
(ii) $H \cap K = \{e\}$.


Symbolically, we write it as $H \rtimes K$.


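Proposition 10.1.19 Let


$$E \equiv 1 \longrightarrow H \xrightarrow{\alpha} G \xrightarrow{\beta} K \longrightarrow 1$$


be a split extension of H by K with a splitting t. Then $G = \alpha(H) \rtimes t(K)$ is the
semi-direct product of $\alpha(H)$ with t(K). Conversely, if $G = H \rtimes K$, then there is a
natural projection p from G to K such that


$$1 \longrightarrow H \xrightarrow{i} G \xrightarrow{p} K \longrightarrow 1$$


is a split extension of H by K.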


Proof Suppose that E is a split extension with splitting t. Clearly, $\alpha(H) = \ker \beta$ is
a normal subgroup of G. Let $g \in G$. Then $\beta(g\,t(\beta(g^{-1}))) = \beta(g)\beta(t(\beta(g^{-1}))) =
\beta(g)\beta(g^{-1}) = e$. This shows that $g\,t(\beta(g^{-1})) \in \ker \beta = \text{image}\, \alpha$. Hence there is
a unique $h \in H$ such that $g\,t(\beta(g^{-1})) = \alpha(h)$. In turn, $g = \alpha(h)t(\beta(g))$. Thus,
$G = \alpha(H)t(K)$. Since t is an injective homomorphism, t(K) is a subgroup of G
isomorphic to K. Let $\alpha(h) \in \alpha(H) \cap t(K)$, $h \in H$. There is a $k \in K$ such that
$\alpha(h) = t(k)$. But, then $e = \beta(\alpha(h)) = \beta(t(k)) = k$. Since t is a homomorphism,
$e = t(e) = t(k) = \alpha(h)$. This shows that $\alpha(H) \cap t(K) = \{e\}$.

Conversely, suppose that $G = H \rtimes K$. Every element $g \in G$ is expressible as
g = hk, where $h \in H$ and $k \in K$. Suppose that $h_1k_1 = h_2k_2$. Then $h_2^{-1}h_1 =
k_2k_1^{-1} \in H \cap K = \{e\}$. This implies that $h_1 = h_2$ and $k_1 = k_2$. Thus, every
element $g \in G$ is uniquely expressible as g = hk, where $h \in H$ and $k \in K$. This
gives us a surjective map p from G to K given by p(g) = k, where g = hk. Also
$(h_1k_1)(h_2k_2) = h_1k_1h_2k_1^{-1}k_1k_2$, where $h_1k_1h_2k_1^{-1} \in H$ and $k_1k_2 \in K$. It follows that
p is a surjective homomorphism from G to K with kernel H. We get the extension


$$1 \longrightarrow H \xrightarrow{i} G \xrightarrow{p} K \longrightarrow 1$$
of H by K with the inclusion map from K to G as splitting. □
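
The converse direction can be illustrated on the smallest non-abelian example. The sketch below assumes $G = S_3$, $H = A_3$ and K the subgroup generated by one transposition (choices made here for illustration), and checks the two defining conditions of a semi-direct product together with the properties of the projection p.

```python
from itertools import permutations

def mul(p, q):
    # (p * q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(len(q)))

def inv(p):
    r = [0] * len(p)
    for i, pi in enumerate(p):
        r[pi] = i
    return tuple(r)

G = list(permutations(range(3)))
e = (0, 1, 2)
H = [e, (1, 2, 0), (2, 0, 1)]   # A_3, normal in S_3
K = [e, (1, 0, 2)]              # subgroup generated by the transposition of 0 and 1

def decompose(g):
    """Return the unique pair (h, k) with g = h k, h in H, k in K."""
    pairs = [(h, k) for h in H for k in K if mul(h, k) == g]
    assert len(pairs) == 1
    return pairs[0]

def p(g):
    # natural projection G -> K
    return decompose(g)[1]
```

Here p restricts to the identity on K, so the inclusion of K in G is a splitting of the resulting extension.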


Following are some applications of the above results. 


Definition 10.1.20 A group H is called a complete group if the homomorphism
$h \mapsto i_h$ ($i_h$ being the inner automorphism determined by h) is an isomorphism from
H to Aut(H). More explicitly, the center Z(H) of H is trivial, and all automorphisms
of H are inner.


Example 10.1.21 There are many complete groups. The symmetric group $S_n$, $n \neq 2, 6$,
the group Aut(G) of automorphisms of a non-abelian simple group G (or more
generally, automorphism groups of direct products of non-abelian simple groups) are
all complete groups. Let H be a cyclic group of odd order. Consider the symmetric
group Sym(H) on the set H. Let $\rho(H)$ denote the image of the Cayley representation
of H in Sym(H). Then the subgroup G of Sym(H) generated by $\rho(H) \cup Aut(H)$ is
also a complete group. We shall give proofs of some of them.


Proposition 10.1.22 Let $H$ be a complete group. Then any extension of $H$ by $K$ is
equivalent to the direct product extension


$$1 \longrightarrow H \stackrel{i_1}{\longrightarrow} H \times K \stackrel{p_2}{\longrightarrow} K \longrightarrow 1,$$


where $i_1$ is the inclusion $h \mapsto (h, e)$, and $p_2$ is the second projection. More explicitly,
if $G$ is any group containing $H$ as a normal subgroup, then there is a subgroup $K$ of
$G$ such that $G$ is the direct product of $H$ and $K$.


Proof Since $H$ is complete, the center $Z(H)$ of $H$ is trivial, and also $Out(H)$ is trivial.
Thus, for any group $K$ there is only one abstract kernel of $K$ to $H$, which is the trivial
homomorphism from $K$ to $Out(H)$. It follows from Theorem 10.1.16 that there is
only one extension of $H$ by $K$ (up to equivalence) which, of course, is the one given
in the proposition. $\sharp$


Conversely, we have the following result due to Baer. 


Proposition 10.1.23 (Baer) Let $H$ be a centerless group. Suppose that for any
group $K$, there is only one extension (up to equivalence) of $H$ by $K$. Then $H$ is a
complete group.


Proof Let $H$ be a centerless group such that all extensions of $H$ are direct product
extensions. Let $\alpha \in Aut(H)$. We wish to show that $\alpha$ is an inner automorphism of $H$.
Consider the symmetric group $Sym(H)$ of permutations on the set $H$. For $h \in H$, let
$L_h$ denote the left multiplication by $h$ on $H$. The map $\chi$ from $H$ to $Sym(H)$ defined
by $\chi(h) = L_h$ is an injective homomorphism from $H$ to $Sym(H)$. Let $G$ denote the
subgroup of $Sym(H)$ generated by $\chi(H) \cup \{\alpha\}$. Let $L_h \in \chi(H)$. Then $\alpha \circ L_h \circ \alpha^{-1} =
L_{\alpha(h)} \in \chi(H)$. This shows that the subgroup $\chi(H)$ is a normal subgroup of $G$, and
we have an extension


$$1 \longrightarrow H \stackrel{\chi}{\longrightarrow} G \stackrel{\nu}{\longrightarrow} G/\chi(H) \longrightarrow 1.$$


By our assumption, this is a direct product extension. Hence there exists a subgroup
$K$ of $G$ isomorphic to $G/\chi(H)$ such that $G = \chi(H) \times K$. As such, elements of $\chi(H)$
commute with elements of $K$, and every element of $G$ is uniquely expressible as a
product of an element of $\chi(H)$ and an element of $K$. Suppose that $\alpha = L_xk$, $x \in H$
and $k \in K$. Now,

$$L_{\alpha(h)} = \alpha L_h \alpha^{-1} = L_x k L_h k^{-1} (L_x)^{-1} = L_x L_h L_x^{-1} = L_{xhx^{-1}}.$$

Since $\chi$ is injective, $\alpha(h) = xhx^{-1}$ for all $h \in H$. This shows that $\alpha$ is the inner
automorphism determined by $x$. $\sharp$


The following is another characterization of a complete group.


Proposition 10.1.24 A group H is a complete group if and only if it has a charac- 
teristic subgroup K with trivial centralizer in H such that all the automorphisms of 
K are induced by inner automorphisms of H. 


Proof Suppose that $H$ is complete. Then we can take $K = H$, which is a characteristic
subgroup of $H$, and since $H$ is complete, every automorphism of $H$ is an inner
automorphism of $H$. Also $\{e\} = Z(H) = C_H(H)$. Conversely, let $H$ be a group with
a characteristic subgroup $K$ whose centralizer $C_H(K)$ in $H$ is trivial, and all of whose
automorphisms are induced by inner automorphisms of $H$. Since $C_H(K)$ is
trivial, $H$ is centerless. It is sufficient (Proposition 10.1.23), therefore, to show that
for any group $G$ containing $H$ as a normal subgroup, there is a subgroup $L$ of $G$
such that $G$ is the direct product of $H$ and $L$. Let $G$ be a group containing $H$ as a normal
subgroup. Then $K$, being a characteristic subgroup of $H$, is normal in $G$. Let $g$ be any
element of $G$. Then the inner automorphism $i_g$ restricted to $K$ is an automorphism of
$K$. By our hypothesis, there is an element $h \in H$ such that $i_g$ and $i_h$ agree on $K$. This
means that $h^{-1}g \in C_G(K)$. Thus, $G = HC_G(K)$. Since $K$ is a normal subgroup of
$G$, $C_G(K)$ is a normal subgroup of $G$. Also $H \cap C_G(K) = C_H(K) = \{e\}$. This
shows that $G$ is the direct product of $H$ and $C_G(K)$. $\sharp$


Corollary 10.1.25 Let $G$ be a non-abelian simple group. Then $Aut(G)$ is a complete
group.


Proof Since $G$ is non-abelian simple, $Inn(G)$ is isomorphic to $G$, and so $Inn(G)$
is simple. In the light of the above proposition, it is sufficient to show that
$Inn(G)$ is a characteristic subgroup of $Aut(G)$ whose centralizer in $Aut(G)$ is trivial,
and all automorphisms of $Inn(G)$ are induced by the inner automorphisms
of $Aut(G)$. $Inn(G)$ is already seen to be a normal subgroup of $Aut(G)$. Let $\alpha \in
C_{Aut(G)}(Inn(G))$. Then $\alpha \circ i_g = i_g \circ \alpha$ for all $g \in G$. Hence $\alpha(g)\alpha(x)\alpha(g)^{-1} =
g\alpha(x)g^{-1}$ for all $x, g \in G$. This shows that $g^{-1}\alpha(g) \in Z(G)$ for all $g \in G$. Since
$G$ is centerless, $\alpha(g) = g$ for all $g \in G$. This means that $\alpha = I_G$. Thus,
$C_{Aut(G)}(Inn(G)) = \{I_G\}$. Next, we show that $Inn(G)$ is a characteristic subgroup
of $Aut(G)$. Let $\alpha \in Aut(Aut(G))$. Since $Inn(G)$ is a normal subgroup of $Aut(G)$,
$\alpha(Inn(G))$ is also a normal subgroup of $Aut(G)$, and so $\alpha(Inn(G)) \cap Inn(G)$ is a
normal subgroup of $Inn(G)$, and also of $\alpha(Inn(G))$. Since $Inn(G)$ and $\alpha(Inn(G))$
are simple, $\alpha(Inn(G)) \cap Inn(G) = \{I_G\}$ or else $\alpha(Inn(G)) \cap Inn(G) = Inn(G) =
\alpha(Inn(G))$. Suppose that $\alpha(Inn(G)) \cap Inn(G) = \{I_G\}$. Then, both subgroups being
normal, the elements of $\alpha(Inn(G))$ commute with the elements of $Inn(G)$. This is a
contradiction to the fact that $C_{Aut(G)}(Inn(G)) = \{I_G\}$. Hence
$\alpha(Inn(G)) \cap Inn(G) = Inn(G) = \alpha(Inn(G))$.
This shows that $Inn(G)$ is a characteristic subgroup of $Aut(G)$.

Let $\chi \in Aut(Inn(G))$. Then there is a bijective map $\mu$ from $G$ to $G$ such
that $\chi(i_g) = i_{\mu(g)}$ for all $g \in G$. Further, $i_{\mu(g_1g_2)} = \chi(i_{g_1g_2}) = \chi(i_{g_1}i_{g_2}) =
\chi(i_{g_1})\chi(i_{g_2}) = i_{\mu(g_1)}i_{\mu(g_2)} = i_{\mu(g_1)\mu(g_2)}$. This shows that $\mu(g_1g_2) = \mu(g_1)\mu(g_2)$ for all $g_1, g_2 \in G$.
Thus, $\mu \in Aut(G)$. Now, $(\mu i_g \mu^{-1})(x) = i_{\mu(g)}(x) = \chi(i_g)(x)$ for all $g, x \in G$. This
shows that $\chi(i_g) = \mu i_g \mu^{-1}$ for all $g \in G$. Thus, $\chi$ is the automorphism of $Inn(G)$
induced by an inner automorphism of $Aut(G)$. $\sharp$


We described the extensions of groups with trivial centers. Let us consider the
other extreme case, when the center of the group is the group itself. More explicitly, we
describe the extensions of abelian groups. Let $H$ be an abelian group. We shall adopt
the additive notation $+$ for the binary operation of $H$. The group $Out(H)$ is naturally
identified with $Aut(H)$. An abstract kernel of $K$ to $H$ is a homomorphism $\sigma$ from $K$
to $Aut(H)$. We discuss the following problem:


Problem Let $H$ be an abelian group. Classify all extensions of $H$ by $K$ (up to
equivalence) with the given abstract kernel $\sigma$.


Let us denote by $EXT_\sigma(H, K)$ the set of equivalence classes of extensions of an
abelian group $H$ by a group $K$ with the given abstract kernel $\sigma$. We have at least
one such extension, viz., the semi-direct product extension of $H$ by $K$ associated to
the homomorphism $\sigma$. Clearly, the factor system associated to the split extension is
$(K, H, \sigma, f_0)$, where $f_0$ is trivial in the sense that $f_0(x, y) = 0$ for all $x, y \in K$. Let
$Z^2_\sigma(K, H)$ denote the set of factor systems $(K, H, \sigma, f)$ associated to the abstract
kernel $\sigma$. Indeed, a factor system in $Z^2_\sigma(K, H)$ determines, and is uniquely determined
by, the corresponding map $f$ which satisfies the condition


$$f(x, y) + f(xy, z) = \sigma_x(f(y, z)) + f(x, yz)$$


for all $x, y, z \in K$. By abuse of language, we shall call such an $f$ a factor system
in $Z^2_\sigma(K, H)$. $f$ is also called a 2-co-cycle associated to $(K, H, \sigma)$. The justification
for the notation $Z^2_\sigma(K, H)$, and the 2-co-cycle terminology, will follow later. Suppose
that $f$ and $f'$ are two members of $Z^2_\sigma(K, H)$. Then


$$\begin{aligned}
(f + f')(x, y) + (f + f')(xy, z)
&= (f(x, y) + f(xy, z)) + (f'(x, y) + f'(xy, z)) \\
&= \sigma_x(f(y, z)) + f(x, yz) + \sigma_x(f'(y, z)) + f'(x, yz) \\
&= \sigma_x((f + f')(y, z)) + (f + f')(x, yz).
\end{aligned}$$

This shows that $f + f'$ is also a factor system associated to the abstract kernel $\sigma$.
Also $-f \in Z^2_\sigma(K, H)$ for all $f \in Z^2_\sigma(K, H)$. Thus, $Z^2_\sigma(K, H)$ is an abelian group with
respect to the operation defined above, and $f_0$ is the identity of the group. Let $B^2_\sigma(K, H)$
denote the set of factor systems which are equivalent to the trivial factor system $f_0$.
More precisely, $f \in B^2_\sigma(K, H)$ if and only if there is a map $g$ from $K$
to $H$ with $g(e) = 0$ such that $f(x, y) = \sigma_x(g(y)) - g(xy) + g(x)$ (see Eq. 10.12
written additively). Note that for any map $g$ with $g(e) = 0$, $f$ defined by $f(x, y) =
\sigma_x(g(y)) - g(xy) + g(x)$ is a factor system. The members of $B^2_\sigma(K, H)$ are called the
2-co-boundaries associated to $(K, H, \sigma)$. The quotient group $Z^2_\sigma(K, H)/B^2_\sigma(K, H)$
is called the second co-homology group associated to $(K, H, \sigma)$, and it is denoted
by $H^2_\sigma(K, H)$.


Theorem 10.1.26 Let $H$ be an abelian group, and $K$ be a group. Let $\sigma$ be an abstract
kernel of $K$ to $H$. Then, there is a natural bijective correspondence $\Gamma$ between the set
$EXT_\sigma(H, K)$ of equivalence classes of extensions of $H$ by $K$ with the given abstract
kernel $\sigma$ and the second co-homology group $H^2_\sigma(K, H)$.


Proof Let $E$ be an extension of $H$ by $K$ with the abstract kernel $\sigma$. Let $t$ be a
section of the extension, and $(K, H, \sigma, f^t)$ be the corresponding factor system. Then
$f^t \in Z^2_\sigma(K, H)$. Let $E'$ be another extension of $H$ by $K$ equivalent to $E$, and $t'$ be a section of
the extension $E'$. Let $(K, H, \sigma, f^{t'})$ be the corresponding factor system. Then (see
Eq. 10.18) there is a map $g$ from $K$ to $H$ with $g(e) = 0$ such that $f^{t'}(x, y) + g(xy) =
\sigma_x(g(y)) + g(x) + f^t(x, y)$. This shows that $f^t + B^2_\sigma(K, H) = f^{t'} + B^2_\sigma(K, H)$.
Thus, the association $(E, t) \mapsto f^t$ induces a map $\Gamma$ from $EXT_\sigma(H, K)$ to $H^2_\sigma(K, H)$
given by $\Gamma([E]) = f^t + B^2_\sigma(K, H)$, where $t$ is a section of $E$. Let $f \in Z^2_\sigma(K, H)$.
Then by Theorem 10.1.11, there is an extension $E$ of $H$ by $K$, and a section $t$ such
that $f^t = f$. This shows that $\Gamma$ is surjective. Let $E_1$ and $E_2$ be extensions of $H$ by
$K$ with sections $t_1$ and $t_2$ and abstract kernel $\sigma$ such that $\Gamma([E_1]) = \Gamma([E_2])$. Then
$f^{t_1} + B^2_\sigma(K, H) = f^{t_2} + B^2_\sigma(K, H)$. Hence there exists a map $g$ from $K$ to $H$ with
$g(e) = 0$ such that $f^{t_1}(x, y) + g(xy) = \sigma_x(g(y)) + g(x) + f^{t_2}(x, y)$. It follows
that the factor system $f^{t_1}$ is equivalent to $f^{t_2}$. Hence the corresponding extensions $E_1$
and $E_2$ are equivalent. $\sharp$
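For small cyclic groups the order of $H^2_\sigma(K, H)$, and hence the number of equivalence classes of extensions, can be found by direct enumeration. The sketch below is only an illustration under stated assumptions: $K = \mathbb{Z}_m$ and $H = \mathbb{Z}_n$ are written additively, the generator of $K$ acts on $H$ as multiplication by a hypothetical parameter `a` with $a^m \equiv 1 \pmod n$, and only normalized maps ($f(e, y) = 0 = f(x, e)$, $g(e) = 0$) are enumerated. The function name is not from the text.

```python
from itertools import product

def second_cohomology_order(m, n, a=1):
    """Order of H^2_sigma(Z_m, Z_n), where 1 in Z_m acts on Z_n as h -> a*h."""
    assert pow(a, m, n) == 1 % n, "a must have order dividing m modulo n"
    K = range(m)

    def act(x, h):
        # sigma_x(h) for x in Z_m, h in Z_n
        return (pow(a, x, n) * h) % n

    def is_cocycle(f):
        # f(x,y) + f(xy,z) = sigma_x(f(y,z)) + f(x,yz), written additively in K
        return all(
            (f[x, y] + f[(x + y) % m, z]) % n == (act(x, f[y, z]) + f[x, (y + z) % m]) % n
            for x in K for y in K for z in K
        )

    # Normalized 2-cochains are free only on pairs of non-identity elements.
    free_keys = [(x, y) for x in range(1, m) for y in range(1, m)]
    cocycles = 0
    for values in product(range(n), repeat=len(free_keys)):
        f = {(x, y): 0 for x in K for y in K}
        f.update(zip(free_keys, values))
        if is_cocycle(f):
            cocycles += 1

    # Coboundaries f(x,y) = sigma_x(g(y)) - g(xy) + g(x) with g(e) = 0.
    coboundaries = set()
    for values in product(range(n), repeat=m - 1):
        g = (0,) + values
        coboundaries.add(tuple(
            (act(x, g[y]) - g[(x + y) % m] + g[x]) % n for x in K for y in K
        ))
    return cocycles // len(coboundaries)
```

For instance, `second_cohomology_order(2, 2)` counts the two classes of extensions of $\mathbb{Z}_2$ by $\mathbb{Z}_2$ with trivial action (represented by $\mathbb{Z}_2 \times \mathbb{Z}_2$ and $\mathbb{Z}_4$). The enumeration grows like $n^{(m-1)^2}$, so it is meant for toy cases only.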


Let $H$ be a group (not necessarily abelian), and $K$ be a group. Though $H$ may not be
abelian, we use the additive notation $+$ for the operation in $H$, and also for the
operation in any extension $G$ of $H$ by $K$. The operation of $K$ is denoted by juxtaposition.
Thus, the identity of $H$ is denoted by $0$, and that of $K$ by $e$. Let $\psi : K \to Out(H) =
Aut(H)/Inn(H)$ be an abstract kernel. Since the center $Z(H)$ of $H$ is a characteristic
subgroup of $H$, we have a homomorphism $\chi : Aut(H) \to Aut(Z(H))$ given by
$\chi(\alpha) = \alpha|_{Z(H)}$. Let $\sigma : K \to Aut(H)$ be a lifting of $\psi$ with $\sigma(e) = I_H$ in the
sense that $\nu \circ \sigma = \psi$, where $\nu$ is the quotient map from $Aut(H)$ to $Out(H)$. Since $\psi$ is
a homomorphism, $\sigma(xy)Inn(H) = (\sigma(x)\sigma(y))Inn(H)$. Hence there is a map $f$ from
$K \times K$ to $H$ such that $\sigma(x)\sigma(y) = i_{f(x,y)}\sigma(xy)$ (recall that $i_h$ denotes the inner
automorphism determined by $h$). Since inner automorphisms fix $Z(H)$ elementwise, it follows that
$(\chi \circ \sigma)(xy) = (\chi \circ \sigma)(x)(\chi \circ \sigma)(y)$ for all $x, y \in K$.
This means that $\chi \circ \sigma$ is a homomorphism from $K$ to $Aut(Z(H))$. Let $\tau$ be another
lifting of $\psi$. Then $\sigma(x)Inn(H) = \tau(x)Inn(H)$ for all $x \in K$. Hence there is a map
$g$ from $K$ to $H$ with $g(e) = 0$ such that $\sigma(x) = i_{g(x)}\tau(x)$ for all $x \in K$. But then
$(\chi \circ \sigma)(x) = (\chi \circ \tau)(x)$ for all $x \in K$. Thus, $\chi \circ \sigma$ depends only on $\psi$ and not on any
particular lifting $\sigma$. In turn, $\chi$ induces a map $\bar{\chi}$ from the set $Hom(K, Out(H))$ of
abstract kernels from $K$ to $H$ to the set $Hom(K, Aut(Z(H)))$ of abstract kernels from $K$ to
$Z(H)$ given by $\bar{\chi}(\psi) = \chi \circ \sigma$, where $\sigma$ is a lifting of $\psi$.


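Proposition 10.1.27 Let


$$E \equiv 1 \longrightarrow H \stackrel{\alpha}{\longrightarrow} G \stackrel{\beta}{\longrightarrow} K \longrightarrow 1$$


and


$$E' \equiv 1 \longrightarrow H \stackrel{\alpha'}{\longrightarrow} G' \stackrel{\beta'}{\longrightarrow} K \longrightarrow 1$$


be extensions of $H$ by $K$ such that $\psi_E = \psi_{E'} = \psi$. Then there is a section $t$ of $E$
and a section $t'$ of $E'$ such that $\sigma^t = \sigma^{t'}$ (both restricting to $\bar{\chi}(\psi)$ on $Z(H)$), and
$-f^t(x, y) + f^{t'}(x, y) \in Z(H)$ for all $x, y \in K$. Further, the map $h$ from $K \times K$ to $Z(H)$ defined by
$h(x, y) = -f^t(x, y) + f^{t'}(x, y)$ is a 2 co-cycle in $Z^2_{\bar{\chi}(\psi)}(K, Z(H))$.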


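Proof Let $t$ be a section of $E$, and $s$ be a section of $E'$. Since $\psi_E = \psi_{E'}$,
$\sigma^t_x Inn(H) = \sigma^s_x Inn(H)$ for all $x \in K$. This means that there is a function $g$
from $K$ to $H$ with $g(e) = 0$ such that $\sigma^t_x = i_{g(x)}\sigma^s_x$ for all $x \in K$. The map
$t'$ from $K$ to $G'$ given by $t'(x) = g(x) + s(x)$ is also a section of $E'$. Further,
then $\sigma^{t'}_x = i_{g(x)}\sigma^s_x = \sigma^t_x$ for all $x$ in $K$. This shows that $\sigma^t = \sigma^{t'}$. Now
$f^t(x, y) = t(x) + t(y) - t(xy)$ and $f^{t'}(x, y) = t'(x) + t'(y) - t'(xy)$. Hence
$i_{f^t(x,y)} = \sigma^t_x\sigma^t_y(\sigma^t_{xy})^{-1} = \sigma^{t'}_x\sigma^{t'}_y(\sigma^{t'}_{xy})^{-1} = i_{f^{t'}(x,y)}$ for all $x, y \in K$.
Thus, $i_{-f^t(x,y) + f^{t'}(x,y)} = I_H$. This shows that $-f^t(x, y) + f^{t'}(x, y) \in Z(H)$ for all
$x, y \in K$. Put $h(x, y) = -f^t(x, y) + f^{t'}(x, y)$. Then, writing $\sigma_x$ for $\sigma^t_x = \sigma^{t'}_x$,


$$\begin{aligned}
h(x, y) + h(xy, z)
&= -f^t(x, y) + f^{t'}(x, y) - f^t(xy, z) + f^{t'}(xy, z) \\
&= -f^t(xy, z) - f^t(x, y) + f^{t'}(x, y) + f^{t'}(xy, z) \\
&= -(f^t(x, y) + f^t(xy, z)) + f^{t'}(x, y) + f^{t'}(xy, z) \\
&= -(\sigma_x(f^t(y, z)) + f^t(x, yz)) + f^{t'}(x, y) + f^{t'}(xy, z) \\
&= -f^t(x, yz) - \sigma_x(f^t(y, z)) + \sigma_x(f^{t'}(y, z)) + f^{t'}(x, yz) \\
&= -f^t(x, yz) + \sigma_x(-f^t(y, z) + f^{t'}(y, z)) + f^{t'}(x, yz), \quad \text{for } \sigma^t = \sigma^{t'} \\
&= -f^t(x, yz) + f^{t'}(x, yz) + \sigma_x(-f^t(y, z) + f^{t'}(y, z)) \\
&= \sigma_x(-f^t(y, z) + f^{t'}(y, z)) - f^t(x, yz) + f^{t'}(x, yz) \\
&= \sigma_x(h(y, z)) + h(x, yz)
\end{aligned}$$


for all $x, y, z \in K$. This shows that $h \in Z^2_{\bar{\chi}(\psi)}(K, Z(H))$. $\sharp$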


Theorem 10.1.28 Let $\psi : K \to Out(H)$ be an abstract kernel from $K$ to $H$ which
is realizable by an extension of $H$ by $K$. Then the second co-homology group
$H^2_{\bar{\chi}(\psi)}(K, Z(H))$ acts sharply transitively on the set $EXT_\psi(H, K)$ of equivalence
classes of extensions of $H$ by $K$ associated to the abstract kernel $\psi$.


Proof Let $E$ be an extension of $H$ by $K$ which realizes the abstract kernel $\psi$, and
$t$ be a section of $E$. Let $(K, H, \sigma^t, f^t)$ be the corresponding factor system. Then
$\psi(x) = \sigma^t_x Inn(H)$ for all $x \in K$. Let $h \in Z^2_{\bar{\chi}(\psi)}(K, Z(H))$. It is easily seen that
$(K, H, \sigma^t, f^t + h)$ is again a factor system. Let $E \star h$ denote the corresponding
extension. Clearly $E \star h$ also realizes $\psi$. Let $h'$ be another 2 co-cycle in $Z^2_{\bar{\chi}(\psi)}(K, Z(H))$
such that the co-homology classes $[h] = h + B^2_{\bar{\chi}(\psi)}(K, Z(H))$ and $[h'] = h' +
B^2_{\bar{\chi}(\psi)}(K, Z(H))$ are equal in $H^2_{\bar{\chi}(\psi)}(K, Z(H))$. Then there is a map $g : K \to Z(H) \subseteq H$ with
$g(e) = 0$ such that $h'(x, y) = \partial g(x, y) + h(x, y)$ for all $x, y \in K$. Clearly $f^t + h' =
f^t + h + \partial g$. Hence $(K, H, \sigma^t, f^t + h)$ is equivalent to $(K, H, \sigma^t, f^t + h')$.
This shows that $[E \star h] = [E \star h']$. Let $E$ and $E'$ be equivalent extensions of $H$
by $K$ which realize $\psi$, and let $h \in Z^2_{\bar{\chi}(\psi)}(K, Z(H))$. By Theorem 10.1.12, we have
sections $t$ and $t'$ of $E$ and $E'$ respectively such that $(K, H, \sigma^t, f^t)$ is equivalent to
$(K, H, \sigma^{t'}, f^{t'})$. Hence, there is a map $g$ from $K$ to $H$ with $g(e) = 0$ such that
$f^{t'}(x, y) + g(xy) = g(x) + \sigma^t_x(g(y)) + f^t(x, y)$ for all $x, y \in K$. Clearly,
$(K, H, \sigma^t, f^t + h)$ is equivalent to $(K, H, \sigma^{t'}, f^{t'} + h)$, and so $[E \star h] = [E' \star h]$.
Thus, we get an action $\star$ of $H^2_{\bar{\chi}(\psi)}(K, Z(H))$ on $EXT_\psi(H, K)$ given by $[E] \star [h] =
[E \star h]$. We show that this action is sharply transitive. Let $E$ and $E'$ be extensions
realizing the abstract kernel $\psi$. By Proposition 10.1.27, there is a section $t$ of $E$, and
there is a section $t'$ of $E'$ such that $\sigma^t = \sigma^{t'}$, and the map $h$ from $K \times K$ to
$Z(H)$ defined by $h(x, y) = -f^t(x, y) + f^{t'}(x, y)$ is a 2 co-cycle in $Z^2_{\bar{\chi}(\psi)}(K, Z(H))$.
Clearly, $[E] \star [h] = [E']$. This shows that the action $\star$ is transitive. Next, suppose
that $[E] \star [h] = [E]$. Then there is a section $t$ of $E$ such that the factor system
$(K, H, \sigma^t, f^t)$ is equivalent to $(K, H, \sigma^t, f^t + h)$. Hence there is a map $g$ from $K$ to $H$
with $g(e) = 0$ such that $f^t(x, y) + h(x, y) + g(xy) = g(x) + \sigma^t_x(g(y)) + f^t(x, y)$
for all $x, y \in K$, and also $g(x) + \sigma^t_x(u) - g(x) = \sigma^t_x(u)$ for all $x \in K$ and $u \in H$.
Since $\sigma^t_x$ is an automorphism of $H$, it follows that $g(x) \in Z(H)$ for all $x \in K$. Thus,
$h(x, y) = \sigma^t_x(g(y)) - g(xy) + g(x)$ for all $x, y \in K$. This shows that $h = \partial g$,
where $g$ is a map from $K$ to $Z(H)$ with $g(e) = 0$. It follows that $[h] = 0$. This
completes the proof of the fact that the action $\star$ is sharply transitive. $\sharp$


Corollary 10.1.29 There is a bijective correspondence between $EXT_\psi(H, K)$ and
$H^2_{\bar{\chi}(\psi)}(K, Z(H))$, provided that there is an extension of $H$ by $K$ which realizes $\psi$. $\sharp$


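Let $H$ be an abelian group and $K$ a group. As $H^2_\sigma(K, H)$ is an abelian group,
the bijective map $\Gamma$ (see Theorem 10.1.26) induces a group structure on the set
$EXT_\sigma(H, K)$ of equivalence classes of extensions of $H$ by $K$ with the given abstract
kernel $\sigma$. We shall try to describe the induced addition, called the Baer sum, on the
class of extensions. Let


$$E_1 \equiv 1 \longrightarrow H \stackrel{\alpha_1}{\longrightarrow} G_1 \stackrel{\beta_1}{\longrightarrow} K \longrightarrow 1$$


and

$$E_2 \equiv 1 \longrightarrow H \stackrel{\alpha_2}{\longrightarrow} G_2 \stackrel{\beta_2}{\longrightarrow} K \longrightarrow 1$$


be two extensions of $H$ by $K$ with abstract kernel $\sigma$. We have the extension $E_1 \oplus E_2$
of $H \oplus H$ by $K \oplus K$ given by


$$E_1 \oplus E_2 \equiv 1 \longrightarrow H \oplus H \stackrel{\alpha_1 \oplus \alpha_2}{\longrightarrow} G_1 \oplus G_2 \stackrel{\beta_1 \oplus \beta_2}{\longrightarrow} K \oplus K \longrightarrow 1.$$


Using this, we construct another extension of $H$ by $K$. Let $\Delta$ denote the diagonal
map from $K$ to $K \oplus K$ defined by $\Delta(x) = (x, x)$. We have the pull back diagram


$$\begin{array}{ccc}
L & \stackrel{\chi}{\longrightarrow} & K \\
\downarrow{\scriptstyle i} & & \downarrow{\scriptstyle \Delta} \\
G_1 \oplus G_2 & \stackrel{\beta_1 \oplus \beta_2}{\longrightarrow} & K \oplus K,
\end{array}$$


where $L = \{(g_1, g_2) \mid \beta_1(g_1) = \beta_2(g_2)\}$, $i$ is the inclusion map, and $\chi$ is given by
$\chi(g_1, g_2) = \beta_1(g_1)$ (ensure that it is a pull back diagram). Clearly, $\chi$ is a surjective
homomorphism, and $\ker \chi = \{(g_1, g_2) \in L \mid \chi(g_1, g_2) = \beta_1(g_1) = e\} = \{(g_1, g_2) \mid
e = \beta_1(g_1) = \beta_2(g_2)\} = H \oplus H$. Thus, we have an extension $\Delta^\star(E_1 \oplus E_2)$ of
$H \oplus H$ by $K$ given by


$$\Delta^\star(E_1 \oplus E_2) \equiv 1 \longrightarrow H \oplus H \stackrel{\alpha_1 \oplus \alpha_2}{\longrightarrow} L \stackrel{\chi}{\longrightarrow} K \longrightarrow 1.$$


Now, let $\nabla$ denote the co-diagonal map from $H \oplus H$ to $H$ given by $\nabla(h_1, h_2) = h_1 +
h_2$. Since $H$ is an abelian group, $\nabla$ is a homomorphism. Let $D = \{(h, -h) \mid h \in H\}$.
Then $D$ is a normal subgroup of $L$, and we have the push out diagram (verify)


$$\begin{array}{ccc}
H \oplus H & \stackrel{\alpha_1 \oplus \alpha_2}{\longrightarrow} & L \\
\downarrow{\scriptstyle \nabla} & & \downarrow{\scriptstyle \nu} \\
H & \stackrel{\eta}{\longrightarrow} & G,
\end{array}$$


where $G = L/D$, $\nu$ is the quotient map, and $\eta$ is given by $\eta(h) = (h, 0) + D$.
Clearly, $\eta$ is a homomorphism. Suppose that $\eta(h) = D$. Then $(h, 0) \in D$. This implies
that $h = 0$. Hence $\eta$ is injective. Again the map $\chi$ from $L$ to $K$ takes $(h, -h)$
to $\beta_1(h) = e$. This shows that $\chi$ induces a surjective homomorphism $\bar{\chi}$ from $G$
to $K$ given by $\bar{\chi}((g_1, g_2) + D) = \beta_1(g_1)$. Also $\ker \bar{\chi} = \{(g_1, g_2) + D \mid e =
\beta_1(g_1) = \beta_2(g_2)\} = \{(h_1, h_2) + D \mid h_1, h_2 \in H\} = \{(h_1 + h_2, 0) + D \mid h_1, h_2 \in
H\} = \text{image}\ \eta$. Thus, we get another extension


$$1 \longrightarrow H \stackrel{\eta}{\longrightarrow} G \stackrel{\bar{\chi}}{\longrightarrow} K \longrightarrow 1$$


of $H$ by $K$, called the Baer sum of $E_1$ and $E_2$, and it is denoted by $E_1 \uplus E_2$. Further,
let $t_1$ be a section of $E_1$, and $t_2$ be a section of $E_2$, with corresponding factor systems
$f^{t_1}$ and $f^{t_2}$. Then we have the section $t_1 + t_2$ of $E_1 \uplus E_2$ given by $(t_1 + t_2)(x) =
(t_1(x), t_2(x)) + D$. It can be easily seen that $f^{t_1 + t_2} = f^{t_1} + f^{t_2}$. It follows that the
bijective map $\Gamma$ from the set $EXT_\sigma(H, K)$ of equivalence classes of extensions of
$H$ by $K$ to the second co-homology group $H^2_\sigma(K, H)$ respects the addition. In turn,
$EXT_\sigma(H, K)$ is an abelian group with respect to the Baer sum, and it is isomorphic
to $H^2_\sigma(K, H)$. $\sharp$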
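Since the Baer sum corresponds to addition of factor systems, its effect can be observed on the smallest nontrivial case. The sketch below is an illustration under assumptions: $H = K = \mathbb{Z}_2$ with trivial action, the group law $(h, x)(k, y) = (h + \sigma_x(k) + f(x, y), xy)$ on $H \times K$ is taken as the extension attached to a normalized factor system $f$, and all function names are hypothetical. The factor system with $f(1, 1) = 1$ yields a cyclic group of order 4, while its sum with itself is trivial and yields the split extension.

```python
def extension_from_factor_set(m, n, f, a=1):
    """Group law on Z_n x Z_m twisted by f: (h,x)*(k,y) = (h + a**x * k + f[x,y], x+y)."""
    def mul(p, q):
        (h, x), (k, y) = p, q
        return ((h + pow(a, x, n) * k + f[x, y]) % n, (x + y) % m)
    return mul

def order(g, mul, identity):
    """Order of g in the finite group with multiplication mul."""
    k, power = 1, g
    while power != identity:
        power = mul(power, g)
        k += 1
    return k

m = n = 2
# Normalized factor system representing the nontrivial class (the extension Z_4).
f1 = {(x, y): (1 if x == y == 1 else 0) for x in range(m) for y in range(m)}
# Factor system of the Baer sum of that extension with itself.
f_sum = {key: (f1[key] + f1[key]) % n for key in f1}

elements = [(h, x) for h in range(n) for x in range(m)]
max_order_E = max(order(g, extension_from_factor_set(m, n, f1), (0, 0)) for g in elements)
max_order_sum = max(order(g, extension_from_factor_set(m, n, f_sum), (0, 0)) for g in elements)
```

The largest element order distinguishes the two groups of order 4: it is 4 for the cyclic group and 2 for $\mathbb{Z}_2 \times \mathbb{Z}_2$.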


Proposition 10.1.30 Let $H$ be an abelian group, and $K$ be a group of order $m$. Then
$mH^2_\sigma(K, H) = \{0\}$ for any abstract kernel $\sigma : K \to Aut(H)$.


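Proof Let $f \in Z^2_\sigma(K, H)$. Consider the map $g$ from $K$ to $H$ given by $g(x) =
\sum_{z \in K} f(x, z)$. Clearly $g(e) = 0$. Now,


$$\begin{aligned}
\partial g(x, y) &= \sigma_x(g(y)) - g(xy) + g(x) \\
&= \sigma_x\Big(\sum_{z \in K} f(y, z)\Big) - \sum_{z \in K} f(xy, z) + \sum_{z \in K} f(x, z) \\
&= \sum_{z \in K} \big(\sigma_x(f(y, z)) - f(xy, z) + f(x, z)\big) \\
&= mf(x, y) + \sum_{z \in K} \big(-f(x, y) - f(xy, z) + \sigma_x(f(y, z)) + f(x, z)\big) \\
&= mf(x, y) + \sum_{z \in K} \big(\sigma_x(f(y, z)) - f(xy, z) + f(x, yz) - f(x, y)\big) \\
&= mf(x, y) + \sum_{z \in K} \partial f(x, y, z) \\
&= mf(x, y) \quad (\text{for } f \in Z^2_\sigma(K, H)).
\end{aligned}$$

The fifth equality holds because $H$ is abelian and, for fixed $y$, the map $z \mapsto yz$ is a
bijection of $K$, so that $\sum_{z \in K} f(x, z) = \sum_{z \in K} f(x, yz)$.
Hence $mf \in B^2_\sigma(K, H)$. This shows that $m[f] = 0$. $\sharp$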
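The averaging identity $\partial g = mf$ can be confirmed numerically for small cyclic groups. This is only a sanity check under assumptions made for the illustration: $K = \mathbb{Z}_m$ acts on $H = \mathbb{Z}_n$ through multiplication by a hypothetical parameter `a` with $a^m \equiv 1 \pmod n$, and only normalized co-cycles are enumerated. The function name is not from the text.

```python
from itertools import product

def averaging_identity_check(m, n, a=1):
    """Return (number of normalized 2-cocycles, whether d(g) == m*f holds for all of them)."""
    assert pow(a, m, n) == 1 % n
    K = range(m)

    def act(x, h):
        # sigma_x(h): the generator of Z_m acts on Z_n as multiplication by a
        return (pow(a, x, n) * h) % n

    free_keys = [(x, y) for x in range(1, m) for y in range(1, m)]
    count, all_ok = 0, True
    for values in product(range(n), repeat=len(free_keys)):
        f = {(x, y): 0 for x in K for y in K}
        f.update(zip(free_keys, values))
        is_cocycle = all(
            (f[x, y] + f[(x + y) % m, z]) % n == (act(x, f[y, z]) + f[x, (y + z) % m]) % n
            for x in K for y in K for z in K
        )
        if not is_cocycle:
            continue
        count += 1
        # g(x) is the sum of f(x, z) over all z in K.
        g = [sum(f[x, z] for z in K) % n for x in K]
        # Compare the coboundary of g with m*f at every pair (x, y).
        all_ok = all_ok and all(
            (act(x, g[y]) - g[(x + y) % m] + g[x]) % n == (m * f[x, y]) % n
            for x in K for y in K
        )
    return count, all_ok
```

For example, `averaging_identity_check(3, 9)` runs through every normalized co-cycle of $\mathbb{Z}_3$ with values in $\mathbb{Z}_9$ under the trivial action.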


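Corollary 10.1.31 Let $H$ be an abelian group of order $n$, and $K$ a group of order
$m$ such that $(m, n) = 1$. Then $H^2_\sigma(K, H) = \{0\}$ for every abstract kernel $\sigma : K \to
Aut(H)$.


Proof From the above proposition, $mH^2_\sigma(K, H) = \{0\}$. Since $nf = 0$ for all maps
$f$ from $K \times K$ to $H$, it follows that $nH^2_\sigma(K, H) = \{0\}$. Since $m$ and $n$ are co-prime,
there are integers $u, v$ with $um + vn = 1$, and so $[f] = um[f] + vn[f] = 0$ for every class $[f]$. Thus,
$H^2_\sigma(K, H) = \{0\}$. $\sharp$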


Corollary 10.1.32 Let $H$ be a finite abelian group of order $n$, and $K$ a group of
order $m$, where $(m, n) = 1$. Then every extension of $H$ by $K$ splits.


Proof It follows from Theorem 10.1.26 that $EXT_\sigma(H, K)$ is in bijective correspondence
with $H^2_\sigma(K, H)$. From the above corollary, it is evident that there is only one
equivalence class of extensions, and indeed, it is the split extension. $\sharp$


Let $H$ and $K$ be finite groups of co-prime orders. Then $Z(H)$ and $K$ are also of co-prime
orders. It follows from Theorem 10.1.28 and the above corollary that $EXT_\psi(H, K)$
contains at most one element. More explicitly, an abstract kernel $\psi : K \to Out(H)$ is
either not realizable from an extension of $H$ by $K$, or there is only one equivalence
class of extensions of $H$ by $K$ associated to the abstract kernel $\psi$. However, it is not
clear whether the unique extension is a split extension. The following theorem asserts that it
is, indeed, a split extension even if $H$ is non-abelian.


Theorem 10.1.33 (Schur-Zassenhaus) Let $G$ be a finite group having a normal
subgroup $H$ such that $H$ and $G/H$ are of co-prime orders. Then $G$ is a split extension
of $H$ by $G/H$ (equivalently, $G$ is a semi-direct product of $H$ with $G/H$).




Proof If $H$ is an abelian subgroup of $G$, then the result follows from the above corollary.
We prove it in the general case. The proof is by induction on the order $|G|$ of $G$. If
$|G| = 1$, then there is nothing to do. Assume that the result is true for all those
groups $L$ for which $|L| < |G|$. We prove it for $G$. Let $H$ be a normal subgroup of
$G$ such that $|H|$ is co-prime to $|G/H|$. Suppose that there is a proper subgroup $K$
of $G$ such that $G = HK$. Then $K \cap H$ is a normal subgroup of $K$ with
$(|K \cap H|, |K/(K \cap H)|) = 1$ (note that $K/(K \cap H)$ is isomorphic to $KH/H = G/H$). Since
$|K| < |G|$, by the induction hypothesis, there is a complement $L$ of $K \cap H$ in $K$.
But then $K = (K \cap H)L$ and $(K \cap H) \cap L = \{e\}$. Hence $G = HK = H(K \cap H)L = HL$
and $H \cap L = \{e\}$ (for $L \subseteq K$). This proves the result for $G$ in case $G$ has a proper
subgroup $K$ such that $G = HK$. Next, assume that there is no proper subgroup
$K$ of $G$ such that $G = HK$. Suppose that there is a nontrivial normal subgroup $M$
of $G$ which is properly contained in $H$. Then $H/M$ is a normal subgroup of $G/M$
such that $H/M$ and $(G/M)/(H/M) \approx G/H$ are of co-prime orders. By the induction
hypothesis, there is a subgroup $L/M$ of $G/M$ such that $G/M = (H/M)(L/M)$ and
$(H/M) \cap (L/M) = \{M\}$. In other words, $G = HL$ and $H \cap L = M$. But then, by our
assumption, $L = G$, and so $M = H \cap L = H \cap G = H$, a contradiction to the supposition
that $M$ is properly contained in $H$. Thus, $H$ is a minimal normal subgroup of $G$. Let
$p$ be a prime dividing the order $|H|$ of $H$. Let $P$ be a Sylow $p$-subgroup of $H$. Since
$(p, |G/H|) = 1$, $P$ is a Sylow $p$-subgroup of $G$ also. Further, since all the Sylow
$p$-subgroups of $G$ are conjugate, and they are all contained in $H$, $G = HN_G(P)$.
Since no proper subgroup $K$ of $G$ satisfies $G = HK$, $N_G(P) = G$. In other words, $P$ is a
normal subgroup of $G$ which is contained
in $H$. Since $H$ is a minimal normal subgroup of $G$, $H = P$ is a $p$-subgroup. Thus,
the center $Z(H) \neq \{e\}$. Since $Z(H)$ is a characteristic subgroup of $H$, and $H$ is normal
in $G$, it follows that $Z(H)$ is normal in $G$. Again the minimality of $H$ ensures that
$H = Z(H)$ is abelian. From the previous corollary, $G$ is a split extension of $H$ by
$G/H$. $\sharp$
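The theorem guarantees a complement but the proof does not construct one explicitly. For a small group one can simply search. The sketch below is an illustration under assumptions: $G = A_4$ and $H$ the Klein four-group (orders 12 and 4, index 3) are chosen for the example, the helper names are hypothetical, and the search is restricted to cyclic subgroups, which suffices here because a complement has prime order 3.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'p after q'."""
    return tuple(p[q[i]] for i in range(len(q)))

def sign(p):
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def generated(gens, identity):
    """Subgroup generated by gens (closure under right multiplication, finite case)."""
    elems, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return elems

e = (0, 1, 2, 3)
G = {p for p in permutations(range(4)) if sign(p) == 1}      # A_4
H = {e, (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)}            # Klein four-group, normal in A_4

def cyclic_complements(G, H, identity):
    """All cyclic subgroups L with |L| = |G/H|, H meet L trivial, and HL = G."""
    index = len(G) // len(H)
    found = []
    for g in G:
        L = generated([g], identity)
        if (len(L) == index and L & H == {identity}
                and {compose(h, l) for h in H for l in L} == G and L not in found):
            found.append(L)
    return found

complements = cyclic_complements(G, H, e)
```

In this example the complements found are exactly the subgroups generated by the 3-cycles, and any one of them exhibits $A_4$ as a semi-direct product of $H$ with a group of order 3.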


10.2 Obstructions and Extensions 


Let us discuss the conditions under which an abstract kernel $\psi$ of $K$ to $H$ can be
realized from an extension of $H$ by $K$. Here, $H$ is not assumed to be an abelian
group. Let $\sigma$ be a map from $K$ to $Aut(H)$ with $\sigma_e = I_H$ such that $\psi(x) = \sigma_x Inn(H)$
for each $x \in K$ (the axiom of choice ensures that such a map exists). Such a map $\sigma$ will
be termed a lifting of $\psi$. Since $\psi$ is a homomorphism, $\sigma_x\sigma_y Inn(H) = \sigma_{xy}Inn(H)$.
Let $f$ be a map from $K \times K$ to $H$ with $f(e, x) = 1 = f(x, e)$ for all $x \in K$ such that


$$\sigma_x\sigma_y = i_{f(x,y)}\sigma_{xy} \qquad (10.19)$$

for all $x, y \in K$ (the existence of such an $f$ is ensured by the axiom of choice). Now,


$$\sigma_x\sigma_y\sigma_z = i_{f(x,y)}\sigma_{xy}\sigma_z = i_{f(x,y)}i_{f(xy,z)}\sigma_{xyz}.$$


On the other hand,


$$\sigma_x\sigma_y\sigma_z = \sigma_x i_{f(y,z)}\sigma_{yz} = \sigma_x i_{f(y,z)}(\sigma_x)^{-1}\sigma_x\sigma_{yz} = i_{\sigma_x(f(y,z))}i_{f(x,yz)}\sigma_{xyz}.$$

Equating both the expressions,


$$i_{f(x,y)f(xy,z)} = i_{\sigma_x(f(y,z))f(x,yz)}$$

for all $x, y, z \in K$. Hence there is a map $\phi$ from $K \times K \times K$ to $Z(H)$ with $1 =
\phi(e, y, z) = \phi(x, e, z) = \phi(x, y, e)$ such that


$$f(x, y)f(xy, z) = \phi(x, y, z)\sigma_x(f(y, z))f(x, yz) \qquad (10.20)$$


Clearly, $f$ is a factor system in $Z^2_\sigma(K, H)$ if and only if $\phi$ is identically trivial. It is
natural, therefore, to term $\phi$ the obstruction to $f$ being a factor system. This
is also called an obstruction associated to the abstract kernel $\psi$.

We have the following proposition. 


Proposition 10.2.1 An abstract kernel $\psi$ can be realized from an extension if and
only if one of its obstructions is trivial. $\sharp$


We further analyze the obstruction $\phi$, and its dependence on the choice of $\sigma$ and the
function $f$. First, $\phi(x, y, z) \in Z(H)$, and the center is a characteristic subgroup. Thus,
$\sigma_x(\phi(y, z, t)) \in Z(H)$ for all $x, y, z, t \in K$. For convenience, we adopt the additive
notation for the operation of $H$. Note that $H$ need not be abelian. However, $Z(H)$ is
abelian. Thus, Eq. (10.20) reads as


$$f(x, y) + f(xy, z) = \phi(x, y, z) + \sigma_x(f(y, z)) + f(x, yz) \qquad (10.21)$$


Proposition 10.2.2 An obstruction $\phi$ associated to an abstract kernel $\psi$ is a
3-co-cycle in the sense that


$$\sigma_x(\phi(y, z, t)) - \phi(xy, z, t) + \phi(x, yz, t) - \phi(x, y, zt) + \phi(x, y, z) = 0$$

for all $x, y, z, t \in K$.


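Proof Using (10.21), we express $f(x, y) + f(xy, z) + f(xyz, t)$ in two different
ways. Recall that the values of $\phi$ lie in $Z(H)$, and that $\sigma_x\sigma_y = i_{f(x,y)}\sigma_{xy}$.


$$\begin{aligned}
& f(x, y) + f(xy, z) + f(xyz, t) \\
&= \phi(x, y, z) + \sigma_x(f(y, z)) + f(x, yz) + f(xyz, t) \\
&= \phi(x, y, z) + \sigma_x(f(y, z)) + \phi(x, yz, t) + \sigma_x(f(yz, t)) + f(x, yzt) \\
&= \phi(x, y, z) + \phi(x, yz, t) + \sigma_x(f(y, z) + f(yz, t)) + f(x, yzt) \\
&= \phi(x, y, z) + \phi(x, yz, t) + \sigma_x(\phi(y, z, t) + \sigma_y(f(z, t)) + f(y, zt)) + f(x, yzt) \\
&= \phi(x, y, z) + \phi(x, yz, t) + \sigma_x(\phi(y, z, t)) + \sigma_x(\sigma_y(f(z, t))) + \sigma_x(f(y, zt)) + f(x, yzt) \\
&= \phi(x, y, z) + \phi(x, yz, t) + \sigma_x(\phi(y, z, t)) + \sigma_x(\sigma_y(f(z, t))) - \phi(x, y, zt) + f(x, y) + f(xy, zt) \\
&= \phi(x, y, z) + \phi(x, yz, t) + \sigma_x(\phi(y, z, t)) - \phi(x, y, zt) + f(x, y) + \sigma_{xy}(f(z, t)) + f(xy, zt).
\end{aligned}$$


On the other hand,


$$\begin{aligned}
& f(x, y) + f(xy, z) + f(xyz, t) \\
&= f(x, y) + \phi(xy, z, t) + \sigma_{xy}(f(z, t)) + f(xy, zt) \\
&= \phi(xy, z, t) + f(x, y) + \sigma_{xy}(f(z, t)) + f(xy, zt).
\end{aligned}$$


Equating the two expressions for $f(x, y) + f(xy, z) + f(xyz, t)$, we get the desired
result. $\sharp$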


Since $Z(H)$ is a characteristic subgroup of $H$, and $\sigma_x$ is an automorphism
of $H$, $\sigma_x|_{Z(H)} \in Aut(Z(H))$. Again, since $\sigma_x\sigma_y = i_{f(x,y)}\sigma_{xy}$, $\sigma_x|_{Z(H)}\,\sigma_y|_{Z(H)} =
\sigma_{xy}|_{Z(H)}$. Thus, $\sigma$ induces a homomorphism from $K$ to $Aut(Z(H))$ which associates
$x$ to $\sigma_x|_{Z(H)}$. This induced homomorphism is again denoted by $\sigma$. Further, if $\tau$ is
another lifting of $\psi$, then $\sigma_x Inn(H) = \tau_x Inn(H)$ for all $x \in K$. Hence there is a map
$g$ from $K$ to $H$ with $g(e) = 1$ such that $\sigma_x = i_{g(x)}\tau_x$ for all $x$. This means that the
induced homomorphisms $\sigma$ and $\tau$ from $K$ to $Aut(Z(H))$ are the same. It follows that the
induced homomorphism $\sigma$ from $K$ to $Aut(Z(H))$ is independent of the lifting $\sigma$, and
it depends only on the abstract kernel $\psi$. Let $Z^3_\psi(K, Z(H)) = Z^3_\sigma(K, Z(H))$ denote
the set of 3 co-cycles associated to the abstract kernel $\psi$. Then this is an abelian
group with respect to the obvious addition, and it is called the group of 3 co-cycles.
Thus, for each choice of $f$ satisfying Eq. 10.19, we obtain an obstruction $\phi$ in
$Z^3_\psi(K, Z(H))$ described by Eq. 10.21.

Now, we examine how the obstruction changes with different choices of
the function $f$ satisfying (10.19). Let $f'$ be another map from $K \times K$ to $H$ with
$f'(e, y) = f'(x, e) = 0$ such that $\sigma_x\sigma_y = i_{f'(x,y)}\sigma_{xy}$. Let $\phi'$ be the obstruction to $f'$.
Since $i_{f'(x,y)} = i_{f(x,y)}$, there is a map $p$ from $K \times K$ to $Z(H)$ such that


$$f'(x, y) = f(x, y) + p(x, y)$$

for all $x, y \in K$. Also

$$f'(x, y) + f'(xy, z) = \phi'(x, y, z) + \sigma_x(f'(y, z)) + f'(x, yz).$$


Putting the values of $f'(x, y)$, we get


$$f(x, y) + f(xy, z) + p(x, y) + p(xy, z) =
\phi'(x, y, z) + \sigma_x(f(y, z)) + \sigma_x(p(y, z)) + f(x, yz) + p(x, yz).$$


In turn, by (10.21),


$$\phi(x, y, z) + \sigma_x(f(y, z)) + f(x, yz) + p(x, y) + p(xy, z) =
\phi'(x, y, z) + \sigma_x(f(y, z)) + \sigma_x(p(y, z)) + f(x, yz) + p(x, yz).$$


Thus, there is a map $p$ from $K \times K$ to $Z(H)$ with $p(e, y) = 0 = p(x, e)$ for all
$x, y \in K$ such that


$$\phi(x, y, z) - \phi'(x, y, z) = \sigma_x(p(y, z)) - p(xy, z) + p(x, yz) - p(x, y).$$


Let us call a map $\phi$ from $K \times K \times K$ to $Z(H)$ a 3 co-boundary if there is a map $p$
from $K \times K$ to $Z(H)$ with $p(e, y) = 0 = p(x, e)$ for all $x, y \in K$ such that


$$\phi(x, y, z) = \partial p(x, y, z) = \sigma_x(p(y, z)) - p(xy, z) + p(x, yz) - p(x, y)$$


for all $x, y, z \in K$. It can be easily verified that a 3 co-boundary is also a 3 co-cycle,
and the set $B^3_\psi(K, Z(H))$ of co-boundaries is a subgroup of the group $Z^3_\psi(K, Z(H))$.
The quotient group $Z^3_\psi(K, Z(H))/B^3_\psi(K, Z(H))$ is called the third co-homology group,
denoted by $H^3_\psi(K, Z(H))$. It follows that $\phi + B^3_\psi(K, Z(H)) = \phi' + B^3_\psi(K, Z(H))$.
Thus, $\phi + B^3_\psi(K, Z(H))$ is independent of the choice of $f$, and we have a map $Obs$
(called the obstruction map) from the set $Hom(K, Out(H))$ of abstract kernels to the
third co-homology group $H^3_\psi(K, Z(H))$ defined by $Obs(\psi) = \phi + B^3_\psi(K, Z(H))$,
where $\phi$ is an obstruction to the choice $f$.
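The claim that a 3 co-boundary is a 3 co-cycle can be checked numerically on a small example. This is a sketch under assumptions made only for the illustration: $K = \mathbb{Z}_3$ acts on the abelian group $\mathbb{Z}_7$ (standing in for $Z(H)$) by $h \mapsto 2h$, the map $p$ is a randomly chosen normalized map, and the function names are hypothetical.

```python
import random

m, n, a = 3, 7, 2                  # K = Z_3 acts on Z_7 by h -> 2h (2**3 = 1 mod 7)
K = range(m)

def act(x, h):
    """sigma_x(h) for x in Z_3 and h in Z_7."""
    return (pow(a, x, n) * h) % n

def coboundary(p):
    """phi(x,y,z) = sigma_x(p(y,z)) - p(xy,z) + p(x,yz) - p(x,y), written additively."""
    return {
        (x, y, z): (act(x, p[y, z]) - p[(x + y) % m, z] + p[x, (y + z) % m] - p[x, y]) % n
        for x in K for y in K for z in K
    }

def is_3_cocycle(phi):
    """Check the 3-cocycle identity of Proposition 10.2.2 at every (x, y, z, t)."""
    return all(
        (act(x, phi[y, z, t]) - phi[(x + y) % m, z, t] + phi[x, (y + z) % m, t]
         - phi[x, y, (z + t) % m] + phi[x, y, z]) % n == 0
        for x in K for y in K for z in K for t in K
    )

rng = random.Random(0)
# Normalized 2-cochain: p(e, y) = 0 = p(x, e), arbitrary elsewhere.
p = {(x, y): (rng.randrange(n) if x and y else 0) for x in K for y in K}
phi = coboundary(p)
```

Besides the co-cycle identity, `phi` inherits the normalization $\phi(e, y, z) = \phi(x, e, z) = \phi(x, y, e) = 0$ from that of $p$.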


Proposition 10.2.3 An abstract kernel $\psi$ can be realized by an extension if and only
if $Obs(\psi) = 0$.


Proof Suppose that $\psi$ can be realized by an extension. Then there is an extension $E$
of $H$ by $K$ realizing $\psi$. Let $t$ be a section of $E$. Then it gives rise to a factor system $(K, H, \sigma^t, f^t)$
with $\psi(x) = \sigma^t_x Inn(H)$ (note that $\sigma^t$ restricted to the center $Z(H)$ is independent of
$t$). In turn,

$$\sigma^t_x\sigma^t_y = i_{f^t(x,y)}\sigma^t_{xy}$$


and


$$f^t(x, y) + f^t(xy, z) = \sigma^t_x(f^t(y, z)) + f^t(x, yz).$$


This shows that the obstruction $\phi$ to the choice $f^t$ is $0$. Hence $Obs(\psi) = 0$. Conversely,
suppose that $Obs(\psi) = 0$. Then there is a map $\sigma$ from $K$ to $Aut(H)$, and a map
$f$ from $K \times K$ to $H$ such that $\sigma_e = I_H$, $f(e, y) = 0 = f(x, e)$, $\psi(x) = \sigma_x Inn(H)$
and $\sigma_x\sigma_y = i_{f(x,y)}\sigma_{xy}$ for all $x, y \in K$. By our assumption, the obstruction $\phi$ to $f$ is a
co-boundary. Hence there is a map $p$ from $K \times K$ to $Z(H)$ with $p(e, y) = 0 = p(x, e)$
for all $x, y \in K$ such that


$$\phi(x, y, z) = \sigma_x(p(y, z)) - p(xy, z) + p(x, yz) - p(x, y).$$

In other words, by (10.21),


$$f(x, y) + f(xy, z) =
\sigma_x(p(y, z)) - p(xy, z) + p(x, yz) - p(x, y) + \sigma_x(f(y, z)) + f(x, yz)$$


for all $x, y, z \in K$. Take $f' = f + p$. Since $p$ takes values in $Z(H)$, $i_{f'(x,y)} = i_{f(x,y)}$ for all
$x, y \in K$. Moving the terms $p(x, y)$ and $p(xy, z)$ to the left hand side, we obtain


$$f'(x, y) + f'(xy, z) = \sigma_x(f'(y, z)) + f'(x, yz).$$


This shows that $(K, H, \sigma, f')$ is a factor system. Let $E$ be the corresponding extension
of $H$ by $K$. Clearly, the associated abstract kernel is the given abstract kernel $\psi$. $\sharp$


Proposition 10.2.4 Let $H$ and $K$ be groups, and $\phi$ be a 3 co-cycle representing an
element in $H^3_\sigma(K, Z(H))$, where $\sigma$ is a homomorphism from $K$ to $Aut(Z(H))$. Assume
that $K$ contains more than two elements. Then there is a group (free in some sense)
$G$ with $Z(G) = Z(H)$, and an abstract kernel $\psi \in Hom(K, Out(G))$ inducing the
homomorphism $\sigma$ from $K$ to $Aut(Z(G))$ such that $Obs(\psi) = [\phi]$.


Proof Let $\phi$ be a 3 co-cycle in $Z^3_\sigma(K, Z(H))$. Let $X = K \times K - K \times \{e\} - \{e\} \times K$
and $F(X)$ the free group on $X$. For convenience, we use the additive notation for $F(X)$
also, although it is non-commutative. Let $G = Z(H) \times F(X)$ be the direct product
of $Z(H)$ and $F(X)$. Then $Z(H)$ and $F(X)$ are naturally identified as subgroups of $G$.
Indeed, for convenience, an element $(a, u) \in G$, $a \in Z(H)$, $u \in F(X)$, will be written
as $a + u$. Since $X$ contains more than one element, it follows that $Z(G) = Z(H)$.
For each $x \in K$, we extend the map $\sigma_x$ to an endomorphism $\bar{\sigma}_x$ of $G$ by defining it
on the free generating set $X$ of $F(X)$ as follows:


$$\bar{\sigma}_x((y, z)) = \phi(x, y, z) + (x, y) + (xy, z) - (x, yz) \qquad (10.22)$$


for all $x, y, z \in K - \{e\}$. Since $\phi(x, y, z) = 0$ whenever any of $x, y, z$ is $e$, the
identity (10.22) will make sense for all $x, y, z \in K$ if we identify $(x, e)$ and $(e, y)$
with the identity of $F(X)$ for all $x, y \in K$. Clearly, $\bar{\sigma}_e = I_G$. We show that


$$\bar{\sigma}_x\bar{\sigma}_y = i_{(x,y)}\bar{\sigma}_{xy} \qquad (10.23)$$


Since $\bar{\sigma}_x$ is an extension of $\sigma_x$ for all $x \in K$, and $\sigma$ is a homomorphism from $K$ to
$Aut(Z(H))$, for $a \in Z(H)$,


$$\bar{\sigma}_x(\bar{\sigma}_y(a)) = \sigma_x(\sigma_y(a)) = \sigma_{xy}(a) = (x, y) + \sigma_{xy}(a) - (x, y).$$


Thus, both sides of (10.23) are equal when restricted to $Z(H) = Z(G)$. It is sufficient,
therefore, to show that both sides of (10.23) evaluated on $(z, t)$ give the same result
for all $z, t \in K$. Using (10.22) and the fact that $\phi$ is a 3-co-cycle, we get


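$$\begin{aligned}
& \bar{\sigma}_x(\bar{\sigma}_y((z, t))) \\
&= \bar{\sigma}_x(\phi(y, z, t) + (y, z) + (yz, t) - (y, zt)) \\
&= \sigma_x(\phi(y, z, t)) + \bar{\sigma}_x((y, z)) + \bar{\sigma}_x((yz, t)) - \bar{\sigma}_x((y, zt)) \\
&= \sigma_x(\phi(y, z, t)) + \phi(x, y, z) + (x, y) + (xy, z) - (x, yz) + \phi(x, yz, t) \\
&\quad + (x, yz) + (xyz, t) - (x, yzt) - (\phi(x, y, zt) + (x, y) + (xy, zt) - (x, yzt)) \\
&= \sigma_x(\phi(y, z, t)) + \phi(x, y, z) + (x, y) + (xy, z) - (x, yz) + \phi(x, yz, t) \\
&\quad + (x, yz) + (xyz, t) - (x, yzt) + (x, yzt) - (xy, zt) - (x, y) - \phi(x, y, zt) \\
&= \sigma_x(\phi(y, z, t)) + \phi(x, y, z) + \phi(x, yz, t) - \phi(x, y, zt) + (x, y) + (xy, z) \\
&\quad - (x, yz) + (x, yz) + (xyz, t) - (x, yzt) + (x, yzt) - (xy, zt) - (x, y) \\
&= (x, y) + \phi(xy, z, t) + (xy, z) + (xyz, t) - (xy, zt) - (x, y) \\
&= i_{(x,y)}\bar{\sigma}_{xy}((z, t)).
\end{aligned}$$


Thus,


$$\bar{\sigma}_x\bar{\sigma}_y = i_{(x,y)}\bar{\sigma}_{xy}$$


for all $x, y \in K$. In particular,


$$\bar{\sigma}_x\bar{\sigma}_{x^{-1}} = i_{(x,x^{-1})}\bar{\sigma}_{xx^{-1}} = i_{(x,x^{-1})}.$$


This shows that $\bar{\sigma}_x$ is surjective, and $\bar{\sigma}_{x^{-1}}$ is injective for all $x \in K$. Hence $\bar{\sigma}_x$ is an
automorphism of $G$. It follows that $\bar{\sigma}$ induces a homomorphism $\psi$ from $K$ to $Out(G)$
whose obstruction is the given co-cycle $\phi$. $\sharp$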


The abstract kernel $\bar{\psi}$, thus obtained, is universal in some sense. Let $\chi$ be a
homomorphism from $K$ to $Out(H)$ such that $Z(H) = Z(G)$, and the obstruction of
$\chi$ is the obstruction of $\bar{\psi}$. Then there is a map $\tau$ from $K$ to $Aut(H)$ with $\tau_e = I_H$ and
a map $f$ from $K \times K$ to $H$ with $f(e, y) = 0 = f(x, e)$ such that


(i) $\tau_x \tau_y = i_{f(x,y)} \tau_{xy}$,

(ii) $\tau_x = \sigma_x$ when restricted to $Z(H) = Z(G)$, and

(iii) f@,y) + f@y.2 -— f@.y) -— FO.2) = @y,2 = &y) + 
(xy, Zz) — @, yz) — O(, 2) 


for all $x, y, z \in K$. Using the universal property of the free group $F(X)$, we get a
unique homomorphism $\rho$ from $G$ to $H$ subject to $\rho((x, y)) = f(x, y)$ and $\rho(h) = h$
for all $h \in Z(H) = Z(G)$. In turn, $\rho \circ \bar{\sigma}_x = \tau_x \circ \rho$ for all $x \in K$.


Example 10.2.5 In this example, we discuss the extensions of a group by a free group
$F$ (say). Let $\psi$ be an abstract kernel of $F$ to $H$. In other words, $\psi$ is a homomorphism
from $F$ to $Out(H)$. Since $F$ is free, we have a homomorphism $\sigma$ from $F$ to $Aut(H)$
such that $\nu \circ \sigma = \psi$, where $\nu$ is the quotient map from $Aut(H)$ to $Out(H)$. The
semi-direct product extension induced by $\sigma$ is an extension of $H$ by $F$ with abstract
kernel $\psi$. Further, since $F$ is free, any extension $E$ given by

$$E \equiv 1 \longrightarrow H \xrightarrow{\alpha} G \xrightarrow{\beta} F \longrightarrow 1$$

splits. Thus, corresponding to any abstract kernel of $F$ to $H$, there is one and only
one equivalence class of extensions of $H$ by $F$, and it is a split extension. In particular,
$H^2_\sigma(F, H) = \{0\}$.


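Example 10.2.6 In this example we discuss the extensions of a group by a cyclic
group. The case of an infinite cyclic group is already included in the above example. We
discuss the extensions of a group $H$ by the cyclic group $\mathbb{Z}_m$, $m \geq 2$. Let $\psi$ be an abstract
kernel of $\mathbb{Z}_m$ to $H$. Let $\sigma_{\bar{1}}$ be an automorphism of $H$ such that $\sigma_{\bar{1}} Inn(H) = \psi(\bar{1})$. Then
the map $\sigma$ from $\mathbb{Z}_m$ to $Aut(H)$ given by $\sigma_{\bar{i}} = (\sigma_{\bar{1}})^i$, $0 \leq i \leq m - 1$, is a lifting of the
abstract kernel $\psi$. Clearly, $(\sigma_{\bar{1}})^m \in Inn(H)$. Note that, for convenience, the image of $\bar{i}$
under the map $\sigma$ is being denoted by $\sigma_{\bar{i}}$. Let $G = H \times \mathbb{Z}_m$. Let $h_0 \in H$ be such that
$\sigma_{\bar{1}}(h_0) = h_0$. Note that at least one such $h_0$ exists, for if worst comes $h_0 = 1$ will do.
Define a product in $G$ by

$$(a, \bar{i})(b, \bar{j}) = (a\sigma_{\bar{i}}(b), \overline{i + j})$$

in case $i + j \leq m - 1$, and

$$(a, \bar{i})(b, \bar{j}) = (a\sigma_{\bar{i}}(b)h_0, \overline{i + j})$$

otherwise. It can be easily seen that $G$ is a group. The identity is $(1, \bar{0})$, and the inverse
of $(h, \bar{i})$ is $(k, \overline{m - i})$, where $\sigma_{\bar{i}}(k) = h^{-1}(h_0)^{-1}$. We have an extension $E$ of $H$ by
$\mathbb{Z}_m$ given by

$$E \equiv 1 \longrightarrow H \xrightarrow{\alpha} G \xrightarrow{\beta} \mathbb{Z}_m \longrightarrow 1,$$

where $\alpha(a) = (a, \bar{0})$ and $\beta$ is the second projection. Consider the section $t$ of $E$ given
by $t(\bar{i}) = (1, \bar{i})$. Then $(\sigma^t_{\bar{i}}(h), \bar{0}) = t(\bar{i})(h, \bar{0})t(\bar{i})^{-1} = (1, \bar{i})(h, \bar{0})(1, \bar{i})^{-1} =
(\sigma_{\bar{i}}(h), \bar{i})(h_0^{-1}, \overline{m - i}) = (\sigma_{\bar{i}}(h)h_0^{-1}h_0, \bar{0}) = (\sigma_{\bar{i}}(h), \bar{0})$. This shows that $\sigma^t = \sigma$. It
follows that the abstract kernel associated to this extension is $\psi$. Thus, every abstract
kernel of $\mathbb{Z}_m$ to $H$ can be realized from an extension of $H$ by $\mathbb{Z}_m$. Note that $f^t(\bar{i}, \bar{j}) = 1$
for $i + j \leq m - 1$ and $h_0$ otherwise. Observe that $(t(\bar{1}))^m = (1, \bar{1})^m = (h_0, \bar{0})$
and $(\sigma_{\bar{1}})^m = i_{h_0}$.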

It also follows from the above discussion that any extension of $H$ by $\mathbb{Z}_m$ with given
abstract kernel $\psi$ is determined by a choice of an element $h_0$ of $H$ which is fixed by $\sigma_{\bar{1}}$.
There is a corresponding section $t$ such that $t(\bar{1})^m = (h_0, \bar{0})$. Let $t'$ be another section of
the extension with $\sigma^{t'} = \sigma^t = \sigma$. Then there is an element $a \in H$ such that $t'(\bar{1}) =
(a, \bar{0})t(\bar{1}) = (a, \bar{0})(1, \bar{1}) = (a, \bar{1})$ and $(\sigma_{\bar{1}}(h), \bar{0}) = i_{t'(\bar{1})}((h, \bar{0})) = i_{(a, \bar{1})}((h, \bar{0})) =
(a, \bar{1})(h, \bar{0})((a, \bar{1}))^{-1} = (a\sigma_{\bar{1}}(h)a^{-1}, \bar{0})$. This shows that $a\sigma_{\bar{1}}(h)a^{-1} = \sigma_{\bar{1}}(h)$ for all
$h \in H$. This means that $a \in Z(H)$. Also, $t'(\bar{1})^m = (a, \bar{1})^m = (N_\sigma(a)h_0, \bar{0})$, where
$N_\sigma$ is the map from $H$ to $H$ given by $N_\sigma(h) = h\sigma(h)\sigma^2(h) \cdots \sigma^{m-1}(h)$ with $\sigma = \sigma_{\bar{1}}$. The map
$N_\sigma$ is called the norm of $\sigma$. Clearly, $N_\sigma$ maps $Z(H)$ to itself, and when restricted to $Z(H)$, it is
a homomorphism. Also $N_\sigma(a)$ and $N_\sigma(a)h_0$ are invariant under $\sigma$. Clearly, the choice
$h_0' = N_\sigma(a)h_0$ determines another equivalent extension of $H$ by $\mathbb{Z}_m$ with prescribed
abstract kernel $\psi$ in the manner described above, and also any equivalent extension
determines such an element $a$ in $Z(H)$. To summarize, we have the following:

“Let $H$ be a group, and $m$ be a natural number. Let $\psi$ be an abstract kernel of
$\mathbb{Z}_m$ to $H$, i.e., $\psi$ is a homomorphism from $\mathbb{Z}_m$ to $Out(H)$. There is a lifting map
$\sigma$ (not necessarily a homomorphism) from $\mathbb{Z}_m$ to $Aut(H)$ with $\sigma(\bar{0}) = I_H$ such that
$\sigma(\bar{i})Inn(H) = \psi(\bar{i})$ for all $\bar{i} \in \mathbb{Z}_m$. Let $Fix(\sigma)$ denote the set $\{h \in H \mid \sigma(h) = h\}$
of elements of $H$ fixed by $\sigma$. Clearly, $Fix(\sigma)$ is a subgroup of $H$. Also, $N_\sigma(Z(H))$ is a
normal subgroup of $Fix(\sigma)$. The construction described above which constructs, for
each $h \in Fix(\sigma)$, an extension of $H$ by $\mathbb{Z}_m$ induces a natural bijection from the group
$Fix(\sigma)/N_\sigma(Z(H))$ to the set $EXT_\psi(\mathbb{Z}_m, H)$ of equivalence classes of extensions of
$H$ by $\mathbb{Z}_m$. In turn, if $H$ is an abelian group, then it induces an isomorphism from
$Fix(\sigma)/N_\sigma(H)$ to the second co-homology group $H^2_\sigma(\mathbb{Z}_m, H)$. In particular, if $\sigma$ is
trivial, then $Fix(\sigma) = H$ and $N_\sigma(H) = \{h^m \mid h \in H\}$. Thus, in this case $H^2(\mathbb{Z}_m, H)$
is the quotient group $H/H^m$. In particular, $H^2(\mathbb{Z}_m, \mathbb{Z})$ is isomorphic to $\mathbb{Z}_m$. Also if
$D$ is an abelian group such that every element of $D$ is an $m$th power, then $H^2(\mathbb{Z}_m, D) = \{0\}$.”
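The summary above can be tried out numerically in the simplest abelian case. The following is a minimal sketch, assuming $H = \mathbb{Z}_n$ written additively and $\sigma$ given by multiplication by a unit $u$ with $u^m \equiv 1 \pmod n$; the function name `h2_order` and the parameter `u` are illustrative choices, not notation from the text. It counts the elements of $Fix(\sigma)/N_\sigma(H)$.

```python
def h2_order(m, n, u=1):
    """Order of Fix(sigma)/N_sigma(H) for H = Z_n (additive notation),
    where sigma is multiplication by u and Z_m acts through powers of sigma."""
    if pow(u, m, n) != 1 % n:
        raise ValueError("sigma^m must be the identity on Z_n")
    # Elements of Z_n fixed by sigma.
    fixed = [h for h in range(n) if (u * h) % n == h]
    # Additively, N_sigma(h) = h + sigma(h) + ... + sigma^{m-1}(h)
    #                        = (1 + u + ... + u^{m-1}) * h.
    coeff = sum(pow(u, k, n) for k in range(m)) % n
    norms = {(coeff * h) % n for h in range(n)}
    return len(fixed) // len(norms)


print(h2_order(4, 6))        # trivial action: order of Z_6 / 4 Z_6
print(h2_order(2, 4, u=3))   # Z_2 acting on Z_4 by inversion
```

For the trivial action the count is the order of $H/H^m$ (here $\gcd(4, 6) = 2$), in line with the statement above; for the inversion action on $\mathbb{Z}_4$ the norm map is zero and the count is the number of fixed elements.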


Exercises 


10.2.1 Find all extensions of (i) Z by Z, 
(ii) Q by Z, 

(iil) Z by Z x Z, 

(iv) a finite group by Z, 
(v) $Q_8$ by $\mathbb{Z}_2$,

(vi) Dg by Zm> 

(vii) Ss by Zo, 

(viii) Ag by Zo, 

(ix) $\mathbb{Z}_2$ by $\mathbb{Z}_2 \times \mathbb{Z}_2$,

(x) $\mathbb{Z}_2 \times \mathbb{Z}_2$ by $\mathbb{Z}_2$,

(xi) $Q_8$ by $V_4$

up to equivalence. 


10.2.2 Characterize groups all of whose extensions are split extensions. 
10.2.3 There are several splittings of a split extension. How are they related? 
10.2.4 Give a proof of the Proposition 10.1.17. 


10.2.5 Show that the kernel of the natural homomorphism $\eta$ from $Aut(G)$ to
$Aut(G/Z(G))$ is $C_{Aut(G)}(Inn(G))$.


10.2.6 Show that a group G is free if and only if every extension by G splits. 


10.2.7 Show, by means of an example, that the number of non-isomorphic groups 
G having a normal subgroup H such that G/H is isomorphic to a fixed group K may 
be strictly less than the number of equivalence classes of extensions of H by K. Hint. 
Look at the Exercise 10.2.1 (ix). 


10.2.8 Describe the extensions of $H$ by $\mathbb{Z}_m \times \mathbb{Z}_n$.


10.2.9 Describe all extensions of a group of order 5 by a group of order 4. Hence 
describe all groups of order 20. 


10.2.10 Let $\tau$ be the automorphism of the Klein's four group $V_4$ given by $\tau(a) =
b$, $\tau(b) = c$. Find $H^2_\sigma(\mathbb{Z}_3, V_4)$, where $\sigma$ is the homomorphism from $\mathbb{Z}_3$ to $Aut(V_4)$
given by $\sigma_{\bar{1}} = \tau$.


10.3. Central Extensions, Schur Multiplier 


In this section we study those extensions of abelian groups for which the associated
abstract kernels are trivial homomorphisms. In other words, we are interested in
extensions $G$ of an abelian group $A$ by a group $K$ for which $A$ is a subgroup of the
center of $G$. More precisely, we have the following definition.




Definition 10.3.1 An extension $E$ of $H$ by $K$ given by

$$E \equiv 1 \longrightarrow H \xrightarrow{\alpha} G \xrightarrow{\beta} K \longrightarrow 1$$

is called a central extension if $\alpha(H) \subseteq Z(G)$.


Example 10.3.2 Any group $G$ is naturally a central extension of its center $Z(G)$ by
the group $Inn(G)$ of its inner automorphisms. Thus, the quaternion group $Q_8$ is a
central extension of $\mathbb{Z}_2$ by the Klein's four group $V_4$.

Let $F$ be a field, and $GL(n, F)$ be the general linear group of invertible $n \times n$
matrices with entries in the field $F$. The center $Z(GL(n, F))$ of $GL(n, F)$ consists
of all scalar matrices $aI_n$, $a \in F^* = F - \{0\}$. The quotient group
$GL(n, F)/Z(GL(n, F))$ is called the projective general linear group, and it is denoted by
$PGL(n, F)$. Thus, $GL(n, F)$ is a central extension of its center by the projective
general linear group $PGL(n, F)$. We have the short exact sequence

$$1 \longrightarrow F^* \xrightarrow{\alpha} GL(n, F) \xrightarrow{\nu} PGL(n, F) \longrightarrow 1 \tag{10.24}$$

where $F^*$ is the multiplicative group of the field, $\alpha$ is the map given by $\alpha(a) = aI_n$,
and $\nu$ is the quotient map. The above exact sequence represents a central extension of
$F^*$ by $PGL(n, F)$. Many groups can be represented as subgroups of a linear group.
Recall that a homomorphism $\rho$ from a group $G$ to the group $GL(n, F)$ is called a linear
representation of $G$ of degree $n$ over the field $F$ (see Chap. 9). A homomorphism from
a group $K$ to $PGL(n, F)$ is called a projective representation of $K$ over $F$. It may
not be possible to lift a projective representation $\rho$ from a group $K$ to $PGL(n, F)$
to a linear representation from $K$ to $GL(n, F)$. For example, the identity projective
representation from $PGL(n, F)$ to itself cannot be lifted to a linear representation as
the exact sequence (10.24) does not split. A natural question is “When can we lift a
projective representation to a linear representation?” This question was first tackled
by Schur in the beginning of the twentieth century.
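A small concrete instance of a projective representation that does not lift may help here. The sketch below is an illustration that is not taken from the text: it uses the integer matrices `X` and `Z` (names chosen here), which satisfy $XZ = -ZX$. Sending the generators of the Klein's four group $V_4$ to the classes of $X$ and $Z$ in $PGL(2, \mathbb{C})$ gives a homomorphism, since the two matrices commute up to a scalar and square to scalars. No choice of scalars $s, t$ makes $sX$ and $tZ$ commute, so there is no linear lift.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )


def scale(s, a):
    """Scalar multiple s * a of a 2x2 matrix."""
    return tuple(tuple(s * entry for entry in row) for row in a)


I2 = ((1, 0), (0, 1))
X = ((0, 1), (1, 0))
Z = ((1, 0), (0, -1))

XZ = matmul(X, Z)
ZX = matmul(Z, X)

# X and Z commute only up to the scalar -1, so their images in PGL(2, C) commute.
print(XZ == scale(-1, ZX))
# X, Z and XZ square to scalar matrices, so their images in PGL(2, C) have order 2.
print(matmul(X, X) == I2, matmul(Z, Z) == I2, matmul(XZ, XZ) == scale(-1, I2))
```

Any lift would send the two generators to $sX$ and $tZ$ for nonzero scalars $s, t$, and $(sX)(tZ) = -(tZ)(sX)$ is never equal to $(tZ)(sX)$, although the generators of $V_4$ commute.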


Let $A$ be an abelian group, and $G$ be a group. Then $Hom(G, A)$ is again an abelian
group with respect to the obvious operation. Let $\alpha$ be a homomorphism from $G$ to
a group $K$. Then $\alpha$ induces a homomorphism $\alpha^*$ from $Hom(K, A)$ to $Hom(G, A)$
defined by $\alpha^*(\eta) = \eta \circ \alpha$.


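Proposition 10.3.3 Let

$$H \xrightarrow{\alpha} G \xrightarrow{\beta} K \longrightarrow 1$$

be an exact sequence of groups, and $A$ be an abelian group. Then the sequence

$$0 \longrightarrow Hom(K, A) \xrightarrow{\beta^*} Hom(G, A) \xrightarrow{\alpha^*} Hom(H, A)$$

is exact.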


Proof The proof of the proposition is again an imitation of the proof of
Theorem 7.2.11, and it is left as an exercise. ♯


Remark 10.3.4 (i) As observed in Remark 7.2.12, the sequence, in general, is
not exact if we adjoin 0 on the right side of the sequence.
(ii) In the language of category theory, the above proposition is expressed by saying
that for all abelian groups $A$, $Hom(-, A)$ is a left exact contra-variant functor from
the category of groups to the category of abelian groups.


Let

$$E \equiv 1 \longrightarrow H \xrightarrow{\alpha} G \xrightarrow{\beta} K \longrightarrow 1 \tag{10.25}$$

be a central extension of $H$ by $K$, and $A$ be an abelian group. Let us denote by $H^2(K, A)$
the second co-homology group $H^2_\sigma(K, A)$ in case the homomorphism $\sigma$ from $K$ to
$Aut(A)$ is trivial. We define a homomorphism $\delta$ (called a connecting homomorphism)
from $Hom(H, A)$ to $H^2(K, A)$ as follows:

Let $t$ be a section of (10.25). Since the extension is central, $\sigma^t$ is trivial in the
sense that $\sigma^t_x(h) = h$ for all $x \in K$ and $h \in H$. The function $f^t$ from $K \times K$ to $H$
is given by $t(x)t(y) = \alpha(f^t(x, y))t(xy)$. Then $f^t$ belongs to the group $Z^2(K, H)$ of 2
co-cycles. Though $\sigma^t$ is independent of the choice of the section $t$, $f^t$ depends on
the choice of $t$. However, if $t'$ is another section of the extension, then $f^t$ and $f^{t'}$
differ by a 2 co-boundary in $B^2(K, H)$. Let $\eta \in Hom(H, A)$. Then $\eta \circ f^t$ is a map
from $K \times K$ to $A$. Since $\eta$ is a homomorphism and $f^t$ is a 2 co-cycle in $Z^2(K, H)$,
$\eta \circ f^t$ is a 2 co-cycle in $Z^2(K, A)$. Also $\eta \circ f^t$ and $\eta \circ f^{t'}$ differ by a 2 co-boundary.
This defines, unambiguously, an element $\eta \circ f^t + B^2(K, A)$ in $H^2(K, A)$. We define
$\delta(\eta) = \eta \circ f^t + B^2(K, A)$. Clearly, $\delta$ defines a homomorphism.
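The construction of $f^t$ and $\delta$ can be followed on the smallest non-split central extension. The sketch below is an illustration under stated assumptions: it takes the extension $0 \to \mathbb{Z}_2 \to \mathbb{Z}_4 \to \mathbb{Z}_2 \to 0$ in additive notation, with $\alpha(h) = 2h$, $\beta(g) = g \bmod 2$ and the section $t(0) = 0$, $t(1) = 1$ (all choices made here, not in the text). It computes $f^t$, checks the 2 co-cycle identity, and checks that $f^t$ is not a 2 co-boundary, so that $\delta$ of the identity map of $\mathbb{Z}_2$ is nonzero.

```python
from itertools import product

K = [0, 1]                       # the quotient Z_2, written additively
t = {0: 0, 1: 1}                 # a section K -> Z_4 with beta(t[x]) == x


def alpha(h):
    """Embedding Z_2 -> Z_4."""
    return (2 * h) % 4


def f(x, y):
    """The map f^t, defined by t(x) + t(y) = alpha(f(x, y)) + t(x + y) in Z_4."""
    diff = (t[x] + t[y] - t[(x + y) % 2]) % 4
    return next(h for h in (0, 1) if alpha(h) == diff)


def is_cocycle(func):
    """2 co-cycle identity for the trivial action, values taken modulo 2."""
    return all(
        (func(y, z) - func((x + y) % 2, z) + func(x, (y + z) % 2) - func(x, y)) % 2 == 0
        for x, y, z in product(K, repeat=3)
    )


def is_coboundary(func):
    """Is func(x, y) = g(y) - g(x + y) + g(x) for some g with g(0) = 0?"""
    for g1 in (0, 1):
        g = {0: 0, 1: g1}
        if all(func(x, y) == (g[y] - g[(x + y) % 2] + g[x]) % 2
               for x, y in product(K, repeat=2)):
            return True
    return False


print(f(1, 1), is_cocycle(f), is_coboundary(f))
```

Here $f^t(1, 1) = 1$ records the carry $1 + 1 = 2 = \alpha(1)$ in $\mathbb{Z}_4$. Since $f^t$ is not a co-boundary, the class $\delta(I_{\mathbb{Z}_2})$ in $H^2(\mathbb{Z}_2, \mathbb{Z}_2)$ is nonzero.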

We have the following fundamental exact sequence associated to the central exten- 
sion (10.25). 


Proposition 10.3.5 For any abelian group $A$, we have the following natural
fundamental exact sequence

$$0 \longrightarrow Hom(K, A) \xrightarrow{\beta^*} Hom(G, A) \xrightarrow{\alpha^*} Hom(H, A) \xrightarrow{\delta} H^2(K, A) \tag{10.26}$$

associated to the central extension given by (10.25).


Proof In the light of Proposition 10.3.3, it is sufficient to prove the exactness
at $Hom(H, A)$. Let $\chi \in Hom(G, A)$. Then by the definition, $\delta(\alpha^*(\chi)) = (\chi \circ \alpha) \circ
f^t + B^2(K, A)$. Now, $t(x)t(y) = \alpha(f^t(x, y))t(xy)$. Hence $\chi(t(x)) + \chi(t(y)) =
\chi(\alpha(f^t(x, y))) + \chi(t(xy))$. Thus, we have the map $g$ from $K$ to $A$ given by $g(x) =
\chi(t(x))$ with $g(e) = 0$ such that $\chi(\alpha(f^t(x, y))) = g(y) - g(xy) + g(x)$. This shows
that $(\chi \circ \alpha) \circ f^t \in B^2(K, A)$, and so $\delta \circ \alpha^* = 0$. It follows that $image\ \alpha^* \subseteq ker\ \delta$.
Let $\eta \in ker\ \delta$. Then $\eta \circ f^t \in B^2(K, A)$. Let $g$ be a map from $K$ to $A$ with $g(e) = 0$
such that

$$\eta(f^t(x, y)) = g(y) - g(xy) + g(x). \tag{10.27}$$

Every element of $G$ is uniquely expressed as $\alpha(a)t(x)$ for unique $a \in H$ and $x \in K$.
Define a map $\chi$ from $G$ to $A$ by $\chi(\alpha(a)t(x)) = \eta(a) + g(x)$. Then

$$\begin{aligned}
\chi(\alpha(a)t(x)\alpha(b)t(y))
&= \chi(\alpha(a)\alpha(\sigma^t_x(b))\alpha(f^t(x, y))t(xy)) \\
&= \chi(\alpha(abf^t(x, y))t(xy)) \\
&= \eta(abf^t(x, y)) + g(xy) \\
&= \eta(a) + \eta(b) + \eta(f^t(x, y)) + g(xy) \\
&= \eta(a) + \eta(b) + g(x) + g(y) \quad \text{(by (10.27))} \\
&= \chi(\alpha(a)t(x)) + \chi(\alpha(b)t(y)).
\end{aligned}$$

This shows that $\chi$ is a homomorphism. Also $\chi(\alpha(a)) = \eta(a)$ for all $a \in H$. Thus,
$\eta = \chi \circ \alpha = \alpha^*(\chi)$. It also follows that $ker\ \delta \subseteq image\ \alpha^*$. ♯


In particular, we have the following exact sequence.

$$0 \longrightarrow Hom(K, H) \xrightarrow{\beta^*} Hom(G, H) \xrightarrow{\alpha^*} Hom(H, H) \xrightarrow{\delta} H^2(K, H) \tag{10.28}$$

The main problem is to determine all central extensions by a group $K$ up to
equivalence. Some authors term it an extension of $K$ instead of by $K$. But, we shall stick
to our earlier terminology by calling it an extension by $K$.


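Definition 10.3.6 A central extension

$$E \equiv 1 \longrightarrow H \xrightarrow{\alpha} G \xrightarrow{\beta} K \longrightarrow 1$$

by $K$ is said to be a free central extension if given any central extension

$$E' \equiv 1 \longrightarrow H' \xrightarrow{\alpha'} G' \xrightarrow{\beta'} K' \longrightarrow 1,$$

and a homomorphism $\eta$ from $K$ to $K'$, there exists a morphism $(\rho, \tau, \eta)$ (not
necessarily unique) from the extension $E$ to the extension $E'$.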


Let

$$1 \longrightarrow R \xrightarrow{i} F \xrightarrow{\nu} K \longrightarrow 1 \tag{10.29}$$

be a presentation of $K$, i.e., $F$ is a free group, $R$ a normal subgroup of $F$, $i$ the inclusion,
and $\nu$ a surjective homomorphism with kernel $R$ (note that $R$ is also free by the
Nielsen-Schreier theorem). The subgroup $[R, F] = \langle \{[r, x] = rxr^{-1}x^{-1} \mid r \in R, x \in F\} \rangle$
is a normal subgroup of $F$ contained in $R$. As such, we get an extension

$$1 \longrightarrow R/[R, F] \xrightarrow{\bar{i}} F/[R, F] \xrightarrow{\bar{\nu}} K \longrightarrow 1 \tag{10.30}$$

by $K$ induced by the presentation (10.29) of $K$. Clearly, this is a central extension
by $K$.


Proposition 10.3.7 The central extension given by (10.30) is a free central extension
by $K$.


Proof Let

$$E' \equiv 1 \longrightarrow H' \xrightarrow{\alpha'} G' \xrightarrow{\beta'} K' \longrightarrow 1 \tag{10.31}$$

be a central extension by $K'$, and $\eta$ be a homomorphism from $K$ to $K'$. Since $F$ is a free
group, from the projective property of a free group, there is a homomorphism $\gamma$ from
$F$ to $G'$ such that $\beta' \circ \gamma = \eta \circ \nu$. Since $\beta'(\gamma(R)) = \eta(\nu(R)) = \{e\}$, $\gamma(R) \subseteq ker\ \beta'$. By
the exactness of (10.31), $\gamma(R) \subseteq \alpha'(H')$. Again, since $\alpha'(H') \subseteq Z(G')$, it follows that
$\gamma([R, F]) = [\gamma(R), \gamma(F)] \subseteq [\alpha'(H'), G'] = \{e\}$. Thus, $\gamma$ induces a homomorphism
$\tau$ from $F/[R, F]$ to $G'$ such that $\beta' \circ \tau = \eta \circ \bar{\nu}$. Clearly, $\beta'(\tau(R/[R, F])) = \{e\}$. By
the exactness, $\tau(R/[R, F]) \subseteq \alpha'(H')$. Since $\alpha'$ is injective, it induces a homomorphism
$\rho$ from $R/[R, F]$ to $H'$ such that $\alpha' \circ \rho$ is $\tau$ restricted to $R/[R, F]$. Thus, $(\rho, \tau, \eta)$ is a
morphism from the extension (10.30) to (10.31). ♯


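Proposition 10.3.8 Let

$$E \equiv 1 \longrightarrow H \xrightarrow{\alpha} G \xrightarrow{\beta} K \longrightarrow 1$$

be a free central extension. Then the map $\delta$ in the associated fundamental sequence
described in Proposition 10.3.5 is surjective. More explicitly, for any abelian
group $A$, the sequence

$$0 \longrightarrow Hom(K, A) \xrightarrow{\beta^*} Hom(G, A) \xrightarrow{\alpha^*} Hom(H, A) \xrightarrow{\delta} H^2(K, A) \longrightarrow 0$$

is exact.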


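Proof Let $\mu$ be a 2 co-cycle in $Z^2(K, A)$. Then $(K, A, \sigma, \mu)$ with $\sigma$ the trivial map
from $K$ to $Aut(A)$ is a factor system. The corresponding associated extension

$$E' \equiv 1 \longrightarrow A \xrightarrow{\alpha'} G' \xrightarrow{\beta'} K \longrightarrow 1$$

is a central extension with a section $t'$ such that $t'(x)t'(y) = \alpha'(\mu(x, y))t'(xy)$. Since
$E$ is a free central extension, we have a homomorphism $\rho$ from $H$ to $A$ and a
homomorphism $\tau$ from $G$ to $G'$ such that the diagram

$$\begin{array}{ccccccccc}
1 & \longrightarrow & H & \xrightarrow{\ \alpha\ } & G & \xrightarrow{\ \beta\ } & K & \longrightarrow & 1 \\
 & & \downarrow \rho & & \downarrow \tau & & \downarrow I_K & & \\
1 & \longrightarrow & A & \xrightarrow{\ \alpha'\ } & G' & \xrightarrow{\ \beta'\ } & K & \longrightarrow & 1
\end{array}$$

is commutative. For each $x \in K$, choose a $t(x) \in G$ such that $\tau(t(x)) = t'(x)$. Then
$\beta(t(x)) = \beta'(\tau(t(x))) = \beta'(t'(x)) = x$. This shows that $t$ is a section of $E$. Now, $t(x)t(y) =
\alpha(f^t(x, y))t(xy)$, where $f^t$ is a 2 co-cycle in $Z^2(K, H)$. Further, $\alpha'(\mu(x, y))t'(xy) =
t'(x)t'(y) = \tau(t(x))\tau(t(y)) = \tau(t(x)t(y)) = \tau(\alpha(f^t(x, y))t(xy)) =
\tau(\alpha(f^t(x, y)))\tau(t(xy)) = \tau(\alpha(f^t(x, y)))t'(xy)$. This shows that
$\alpha'(\rho(f^t(x, y))) = \alpha'(\mu(x, y))$. Since $\alpha'$ is injective, $\rho(f^t(x, y)) = \mu(x, y)$. By the
definition, $\delta(\rho) = \mu + B^2(K, A)$. It follows that $\delta$ is surjective. ♯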


Next, we try to describe the image of the connecting homomorphism $\delta$ in the
fundamental exact sequence associated to a central extension under the assumption
that $A$ is a divisible group. Recall that an abelian group $A$ is called a divisible group
if for any $a \in A$ and any nonzero integer $n$, there is an element $b \in A$ such that $nb = a$. For
example, $(\mathbb{Q}, +)$, $(\mathbb{Q}/\mathbb{Z}, +)$, $(\mathbb{R}, +)$, $(\mathbb{C}, +)$, $(\mathbb{C}^*, \cdot)$, and the circle group $(S^1, \cdot)$
are all divisible groups. From Corollary 7.2.31, an abelian group $D$ is a divisible group if
and only if given any subgroup $H$ of an abelian group $G$, all homomorphisms $f$ from
$H$ to $D$ are restrictions of homomorphisms from $G$ to $D$. Equivalently, the functor
$Hom(-, D)$ from the category of abelian groups to itself takes a short exact sequence
to a short exact sequence.


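Proposition 10.3.9 Let

$$E \equiv 1 \longrightarrow H \xrightarrow{\alpha} G \xrightarrow{\beta} K \longrightarrow 1$$

be a central extension, and $D$ be a divisible group. Then the image of $\delta$ in the
fundamental exact sequence

$$0 \longrightarrow Hom(K, D) \xrightarrow{\beta^*} Hom(G, D) \xrightarrow{\alpha^*} Hom(H, D) \xrightarrow{\delta} H^2(K, D)$$

is isomorphic to $Hom([G, G] \cap \alpha(H), D)$. In particular, if the central extension is a
free central extension, then $H^2(K, D)$ is isomorphic to $Hom([G, G] \cap \alpha(H), D)$.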


Proof By the fundamental theorem of homomorphism, $image\ \delta \cong Hom(H, D)/
ker\ \delta = Hom(H, D)/image\ \alpha^*$. The map $\alpha$ induces an injective homomorphism $\bar{\alpha}$ from
$H/(H \cap \alpha^{-1}([G, G]))$ to $G/[G, G]$. Since $D$ is divisible, $\bar{\alpha}^*$ is a surjective
homomorphism from $Hom(G/[G, G], D)$ to $Hom(H/(H \cap \alpha^{-1}([G, G])), D)$. Also, since $D$ is
abelian, $\nu^*$ from $Hom(G/[G, G], D)$ to $Hom(G, D)$ is an isomorphism, where $\nu$ is
the quotient map. Further the diagram

$$\begin{array}{ccc}
Hom(G/[G, G], D) & \xrightarrow{\ \bar{\alpha}^*\ } & Hom(H/(H \cap \alpha^{-1}([G, G])), D) \\
\downarrow \nu^* & & \downarrow \nu^* \\
Hom(G, D) & \xrightarrow{\ \alpha^*\ } & Hom(H, D)
\end{array}$$

is commutative. It follows that the image of $\alpha^*$ is the image of the right-hand vertical
map $\nu^*$. Again, since $D$ is divisible, the following sequence is exact:

$$0 \longrightarrow Hom(H/(H \cap \alpha^{-1}([G, G])), D) \xrightarrow{\nu^*} Hom(H, D) \xrightarrow{i^*} Hom(H \cap \alpha^{-1}([G, G]), D) \longrightarrow 0.$$

Thus, $Hom(H, D)/image\ \alpha^*$ is isomorphic to $Hom(H \cap \alpha^{-1}([G, G]), D)$. Clearly,
$Hom(H \cap \alpha^{-1}([G, G]), D)$ is isomorphic to $Hom([G, G] \cap \alpha(H), D)$. This
shows that $image\ \delta$ is isomorphic to $Hom([G, G] \cap \alpha(H), D)$. The last assertion
follows from Proposition 10.3.8. ♯


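Corollary 10.3.10 Given a central extension

$$E \equiv 1 \longrightarrow H \xrightarrow{\alpha} G \xrightarrow{\beta} K \longrightarrow 1$$

by $K$, $Hom([G, G] \cap \alpha(H), \mathbb{C}^*)$ is isomorphic to a subgroup of $H^2(K, \mathbb{C}^*)$. ♯


Corollary 10.3.11 Given a presentation

$$1 \longrightarrow R \xrightarrow{i} F \xrightarrow{\nu} K \longrightarrow 1$$

of $K$, $H^2(K, \mathbb{C}^*)$ is isomorphic to $Hom(([F, F] \cap R)/[R, F], \mathbb{C}^*)$.


Proof The given presentation induces the free central extension

$$1 \longrightarrow R/[R, F] \xrightarrow{\bar{i}} F/[R, F] \xrightarrow{\bar{\nu}} K \longrightarrow 1$$

by $K$. Since the commutator subgroup of $F/[R, F]$ is $[F, F]/[R, F]$, its intersection with
$R/[R, F]$ is $([F, F] \cap R)/[R, F]$. The result follows from Proposition 10.3.9. ♯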


Proposition 10.3.12 Let $K$ be a finite group of order $n$. Then $H^2(K, \mathbb{C}^*)$ is also a finite
abelian group in which the order of each element divides $n$.


Proof Let $f \in Z^2(K, \mathbb{C}^*)$. Then

$$f(x, y)f(xy, z) = f(y, z)f(x, yz)$$

for all $x, y, z \in K$. Taking the product of the equation over all $z \in K$, we get

$$f(x, y)^n \prod_{z \in K} f(xy, z) = \Big[\prod_{z \in K} f(y, z)\Big]\Big[\prod_{z \in K} f(x, z)\Big].$$

Define a map $g$ from $K$ to $\mathbb{C}^*$ by $g(x) = \prod_{z \in K} f(x, z)$. Then $g(e) = 1$ and the above
equation reads as

$$f(x, y)^n = g(y)g(xy)^{-1}g(x).$$

This means that $f^n$ is a co-boundary. Hence the order of each element of $H^2(K, \mathbb{C}^*)$
divides $n$. Selecting an $n$th root $u(x)$ of $g(x)$ for each $x \in K$, with $u(e) = 1$, we get a
map $u$ from $K$ to $\mathbb{C}^*$. Define a map $f'$ from $K \times K$ to $\mathbb{C}^*$ by

$$f'(x, y) = f(x, y)u(y)^{-1}u(xy)u(x)^{-1}.$$

It follows that $f' + B^2(K, \mathbb{C}^*) = f + B^2(K, \mathbb{C}^*)$ and $f'(x, y)^n = 1$. Thus, for
each $x, y \in K$, there are only finitely many possibilities for $f'(x, y)$. Since $K$ is finite,
$H^2(K, \mathbb{C}^*)$ is finite. ♯
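The averaging step in this proof can be checked numerically on a small case. The sketch below is illustrative only: it takes $K = \mathbb{Z}_3$ in additive notation and the "carry" map $f(x, y) = c$ if $x + y \geq 3$ and $1$ otherwise, for an arbitrarily chosen nonzero complex number `c` (both the example and the names are assumptions made here). It verifies the 2 co-cycle identity and the relation $f(x, y)^n = g(y)g(xy)^{-1}g(x)$ up to floating point tolerance.

```python
from itertools import product

n = 3
K = range(n)          # Z_3, written additively
c = 2 + 1j            # an arbitrary nonzero complex number, chosen for illustration


def f(x, y):
    """Carry map with values in C*: c when x + y wraps around modulo n, else 1."""
    return c if x + y >= n else 1


def g(x):
    """g(x) is the product of f(x, z) over all z in K, as in the proof."""
    out = 1
    for z in K:
        out *= f(x, z)
    return out


def cocycle_identity_holds():
    # f(x, y) f(xy, z) = f(y, z) f(x, yz), with the group law written as addition mod n
    return all(
        abs(f(x, y) * f((x + y) % n, z) - f(y, z) * f(x, (y + z) % n)) < 1e-9
        for x, y, z in product(K, repeat=3)
    )


def nth_power_is_coboundary():
    # f(x, y)^n = g(y) g(xy)^{-1} g(x)
    return all(
        abs(f(x, y) ** n - g(y) * g(x) / g((x + y) % n)) < 1e-9
        for x, y in product(K, repeat=2)
    )


print(cocycle_identity_holds(), nth_power_is_coboundary())
```

In this example $g(x) = c^x$ for $x = 0, 1, 2$, so the right-hand side equals $c^3$ exactly when $x + y$ wraps around, which matches $f(x, y)^3$.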


Corollary 10.3.13 (Schur) Let $G$ be a group such that $G/Z(G)$ is finite. Then the
commutator subgroup $[G, G]$ of $G$ is finite.


Proof Suppose that $n = |G/Z(G)|$. Then by Proposition 10.3.12, $H^2(G/Z(G), \mathbb{C}^*)$
is finite, and the order of each of its elements divides $n$. By Corollary 10.3.10,
$Hom([G, G] \cap Z(G), \mathbb{C}^*)$ is isomorphic to a subgroup of $H^2(G/Z(G), \mathbb{C}^*)$. Thus,
$Hom([G, G] \cap Z(G), \mathbb{C}^*)$ is finite, and the order of each element of
$Hom([G, G] \cap Z(G), \mathbb{C}^*)$ divides $n$. If $[G, G] \cap Z(G)$ contains an element $a$ of infinite order,
then $Hom(\langle a \rangle, \mathbb{C}^*) \cong \mathbb{C}^*$ is infinite. Since $\mathbb{C}^*$ is a divisible group, the map $i^*$ from
$Hom([G, G] \cap Z(G), \mathbb{C}^*)$ to $Hom(\langle a \rangle, \mathbb{C}^*)$ induced by the inclusion $i$ is surjective.
This contradicts the fact that $Hom([G, G] \cap Z(G), \mathbb{C}^*)$ is finite. This shows that
$[G, G] \cap Z(G)$ is a torsion group. Suppose that $G$ is finitely generated. Then $Z(G)$, being a
subgroup of finite index, is also finitely generated. Hence $[G, G] \cap Z(G)$ is a finitely generated
torsion abelian group, and so it is finite. Since $[G, G]/([G, G] \cap Z(G))$ is isomorphic to a
subgroup of the finite group $G/Z(G)$, it follows that $[G, G]$ is finite. Suppose, now, that $G$ is
not finitely generated. Let $S$ be a transversal to $Z(G)$. Then $S$ is finite. Let $L = \langle S \rangle$ be the
subgroup generated by $S$. Then $L$ is a finitely generated subgroup of $G$. Let $x, y \in G$.
Then there are elements $a, b \in Z(G)$ and $u, v \in S$ such that $x = au$ and $y = bv$.
But, then $[x, y] = [u, v]$. This shows that $[L, L] = [G, G]$. Since $G = Z(G)L$,
the center $Z(L)$ of $L$ is contained in $Z(G)$. Thus, $Z(G) \cap L = Z(L)$. Further,
$G/Z(G) = LZ(G)/Z(G) \cong L/(Z(G) \cap L) = L/Z(L)$. This means that $L/Z(L)$
is finite. It follows from the earlier proved fact that $[L, L]$ is finite. Hence $[G, G]$ is
finite. ♯


Corollary 10.3.14 (Schur-Hopf Formula) Let

$$1 \longrightarrow R \xrightarrow{i} F \xrightarrow{\nu} K \longrightarrow 1$$

be a free presentation of a finite group $K$. Then $H^2(K, \mathbb{C}^*)$ is isomorphic to

$$([F, F] \cap R)/[R, F].$$


Proof By Corollary 10.3.11, $H^2(K, \mathbb{C}^*)$ is isomorphic to $Hom(([F, F] \cap R)/
[R, F], \mathbb{C}^*)$. Further, $R/[R, F] \subseteq Z(F/[R, F])$. Since $F/R$, being isomorphic to $K$,
is finite, it follows that $(F/[R, F])/Z(F/[R, F])$ is finite. From Corollary 10.3.13, it
follows that the commutator subgroup $[F, F]/[R, F]$ of $F/[R, F]$ is finite. In turn,
$([F, F] \cap R)/[R, F]$ is finite abelian. Clearly, $Hom(\mathbb{Z}_m, \mathbb{C}^*) \cong \mathbb{Z}_m$. Also $Hom$ respects
direct sums in the sense that $Hom(A \oplus B, C) \cong Hom(A, C) \oplus Hom(B, C)$. Since every finite
abelian group is a direct sum of finite cyclic groups, it follows that for any finite
abelian group $A$, $Hom(A, \mathbb{C}^*) \cong A$. This shows that for finite groups $K$, $H^2(K, \mathbb{C}^*)$
is isomorphic to $([F, F] \cap R)/[R, F]$. ♯


It follows from the above result that for a finite group $K$, the group $([F, F] \cap R)/
[R, F]$ is independent of the choice of a free presentation of $K$. Indeed, we show
that for any group (not necessarily finite), it is independent of the choice of a free
presentation of the group.


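Proposition 10.3.15 Let

$$E_F \equiv 1 \longrightarrow R \xrightarrow{i} F \xrightarrow{\nu} K \longrightarrow 1$$

be an extension by $K$ representing a free presentation of $K$, and

$$E \equiv 1 \longrightarrow H \xrightarrow{i} G \xrightarrow{\beta} L \longrightarrow 1$$

an extension by $L$, where $i$ denotes the inclusion map. Let $\gamma$ be a homomorphism from
$K$ to $L$. Then there is a homomorphism $\tau$ from $F$ to $G$ (not necessarily unique) such
that $(\tau|_R, \tau, \gamma)$ is a morphism from the extension $E_F$ to $E$. Further, the morphism
$(\tau|_R, \tau, \gamma)$ induces a homomorphism $\bar{\tau}$ from $[F, F]/[R, F]$ to $[G, G]/[H, G]$ such
that the diagram

$$\begin{array}{ccccccccc}
1 & \longrightarrow & ([F, F] \cap R)/[R, F] & \xrightarrow{\ \bar{i}\ } & [F, F]/[R, F] & \xrightarrow{\ \bar{\nu}\ } & [K, K] & \longrightarrow & 1 \\
 & & \downarrow \bar{\rho} & & \downarrow \bar{\tau} & & \downarrow \bar{\gamma} & & \\
1 & \longrightarrow & ([G, G] \cap H)/[H, G] & \xrightarrow{\ \bar{i}\ } & [G, G]/[H, G] & \xrightarrow{\ \bar{\beta}\ } & [L, L] & \longrightarrow & 1
\end{array}$$

is commutative, where the maps in the rows are the obvious induced maps,
while $\bar{\rho}$ and $\bar{\gamma}$ are the restrictions of $\bar{\tau}$ and $\gamma$ respectively. Further, if $\lambda$ is another
homomorphism from $F$ to $G$ such that $(\lambda|_R, \lambda, \gamma)$ is a morphism from the extension
$E_F$ to $E$, then the induced homomorphism $\bar{\lambda}$ is the same as $\bar{\tau}$.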


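Proof Since $F$ is free, we have a homomorphism (not necessarily unique) $\tau$ from $F$
to $G$ such that $\beta \circ \tau = \gamma \circ \nu$. Clearly, $(\tau|_R, \tau, \gamma)$ is a morphism from the extension
$E_F$ to $E$. Also $\tau$ maps $[R, F]$ to $[H, G]$. Thus, $\tau$ induces a homomorphism $\bar{\tau}$ from
$F/[R, F]$ to $G/[H, G]$ such that the diagram

$$\begin{array}{ccccccccc}
1 & \longrightarrow & R/[R, F] & \xrightarrow{\ \bar{i}\ } & F/[R, F] & \xrightarrow{\ \bar{\nu}\ } & K & \longrightarrow & 1 \\
 & & \downarrow \bar{\rho} & & \downarrow \bar{\tau} & & \downarrow \gamma & & \\
1 & \longrightarrow & H/[H, G] & \xrightarrow{\ \bar{i}\ } & G/[H, G] & \xrightarrow{\ \bar{\beta}\ } & L & \longrightarrow & 1
\end{array}$$

is commutative, where $\bar{\rho}$ is the restriction of $\bar{\tau}$. Again, since $\nu$ maps $[F, F]$ to $[K, K]$,
$\beta$ maps $[G, G]$ to $[L, L]$, and $\bar{\tau}$ maps $[F, F]/[R, F]$ to $[G, G]/[H, G]$, the diagram
in the statement of the proposition is commutative. Suppose that there is another
homomorphism $\lambda$ from $F$ to $G$ such that $(\lambda|_R, \lambda, \gamma)$ is a morphism from the extension
$E_F$ to $E$. Then as for $\tau$, $\lambda$ induces a homomorphism $\bar{\lambda}$ from $F/[R, F]$ to $G/[H, G]$
such that the corresponding diagram, with $\bar{\lambda}$ in place of $\bar{\tau}$ and the restriction of $\bar{\lambda}$ to
$R/[R, F]$ in place of $\bar{\rho}$, is commutative. Let $\bar{x} \in F/[R, F]$. Then $\bar{\beta}(\bar{\lambda}(\bar{x})) =
\gamma(\bar{\nu}(\bar{x})) = \bar{\beta}(\bar{\tau}(\bar{x}))$. Hence $\bar{\lambda}(\bar{x}) = u(\bar{x})\bar{\tau}(\bar{x})$ for some $u(\bar{x}) \in H/[H, G]$. Since
$H/[H, G]$ is contained in the center of $G/[H, G]$, $\bar{\lambda}([\bar{x}, \bar{y}]) = [\bar{\lambda}(\bar{x}), \bar{\lambda}(\bar{y})] =
[u(\bar{x})\bar{\tau}(\bar{x}), u(\bar{y})\bar{\tau}(\bar{y})] = [\bar{\tau}(\bar{x}), \bar{\tau}(\bar{y})] = \bar{\tau}([\bar{x}, \bar{y}])$. This shows that the induced
homomorphisms $\bar{\lambda}$ and $\bar{\tau}$ agree when restricted to $[F, F]/[R, F]$, and so also on
$([F, F] \cap R)/[R, F]$. ♯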


Corollary 10.3.16 Given two free presentations

$$E_F \equiv 1 \longrightarrow R \xrightarrow{i} F \xrightarrow{\nu} K \longrightarrow 1$$

and

$$E_{F'} \equiv 1 \longrightarrow R' \xrightarrow{i'} F' \xrightarrow{\nu'} K \longrightarrow 1$$

of $K$, the groups $([F, F] \cap R)/[R, F]$ and $([F', F'] \cap R')/[R', F']$ are naturally
isomorphic.


Proof From Proposition 10.3.15, for the identity map $I_K$ from $K$ to $K$, we have a
unique homomorphism $\bar{\rho}$ from $([F, F] \cap R)/[R, F]$ to $([F', F'] \cap R')/[R', F']$ which
is induced by a morphism $(\tau|_R, \tau, I_K)$ from $E_F$ to $E_{F'}$, and also a unique
homomorphism $\bar{\rho}'$ from $([F', F'] \cap R')/[R', F']$ to $([F, F] \cap R)/[R, F]$ which is induced
by a morphism $(\tau'|_{R'}, \tau', I_K)$ from $E_{F'}$ to $E_F$. Thus, $((\tau' \circ \tau)|_R, \tau' \circ \tau, I_K)$ and
$(I_F|_R, I_F, I_K)$ are both morphisms from $E_F$ to itself, and so they induce the same
homomorphism from $([F, F] \cap R)/[R, F]$ to itself. This means that $\bar{\rho}' \circ \bar{\rho}$ is a
homomorphism from $([F, F] \cap R)/[R, F]$ to itself which is induced by $(I_F|_R, I_F, I_K)$. Hence
$\bar{\rho}' \circ \bar{\rho}$ is the identity map on $([F, F] \cap R)/[R, F]$. Similarly $\bar{\rho} \circ \bar{\rho}'$ is the identity map
on $([F', F'] \cap R')/[R', F']$. This shows that $\bar{\rho}$ and $\bar{\rho}'$ are isomorphisms. ♯


Since the group $([F, F] \cap R)/[R, F]$ is independent of a particular choice of the
presentation of $K$, we are justified in making the following definition.


Definition 10.3.17 Let

$$1 \longrightarrow R \xrightarrow{i} F \xrightarrow{\nu} K \longrightarrow 1$$

be a free presentation of a group $K$. Then $([F, F] \cap R)/[R, F]$ is called the Schur
Multiplier of $K$, and it is denoted by $M(K)$.


Corollary 10.3.18 For finite groups $K$, $M(K)$ is finite, and it is isomorphic to
$H^2(K, \mathbb{C}^*)$. If the order of $K$ is $n$, then the order of each element of $M(K)$ divides $n$.


Proof Follows from Corollary 10.3.14 and Proposition 10.3.12. ♯


Corollary 10.3.19 The Schur multiplier $M$ defines a co-variant functor from the
category of groups to the category of abelian groups.


Proof Let $E_K$ denote the standard free multiplication presentation of $K$. More
precisely,

$$E_K \equiv 1 \longrightarrow R_K \xrightarrow{i} F_K \xrightarrow{\mu} K \longrightarrow 1,$$

where $F_K$ is the free group on $K$, $\mu$ the unique homomorphism from $F_K$ to
$K$ induced by $I_K$, and $R_K$ the kernel of $\mu$. Then the Schur multiplier $M(K) =
(R_K \cap [F_K, F_K])/[R_K, F_K]$. Let $\lambda$ be a homomorphism from $K$ to $L$. Then from
Proposition 10.3.15, $\lambda$ induces a unique homomorphism $M(\lambda)$ from $M(K) =
(R_K \cap [F_K, F_K])/[R_K, F_K]$ to $M(L) = (R_L \cap [F_L, F_L])/[R_L, F_L]$. Further, if $\eta$ is a
homomorphism from the group $L$ to a group $U$, $M(\eta) \circ M(\lambda)$ is the unique
homomorphism which is induced by $\eta \circ \lambda$. Hence $M(\eta \circ \lambda) = M(\eta) \circ M(\lambda)$. Clearly,
$M(I_K) = I_{M(K)}$. ♯


Proposition 10.3.20 Let

$$E \equiv 1 \longrightarrow R \xrightarrow{i} F \xrightarrow{\nu} K \longrightarrow 1$$

be a free presentation of a finite group $K$, where $F$ is the free group on a set $\{x_1, x_2, \ldots, x_n\}$
consisting of $n$ elements. Then, $M(K) = (R \cap [F, F])/[R, F]$ is the finite
torsion subgroup of $R/[R, F]$, and the torsion-free part $(R/[R, F])/((R \cap [F, F])/
[R, F]) \cong R/(R \cap [F, F])$ is free abelian of rank $n$. In turn, $R/[R, F]$ is
isomorphic to the direct sum of $M(K)$ and $R/(R \cap [F, F])$.


Proof Since $R/[R, F]$ is contained in the center of $F/[R, F]$, it is abelian. Also
$F/[F, F]$ is free abelian of rank $n$. Further $R/(R \cap [F, F])$ is isomorphic to $R[F, F]/
[F, F]$ which is a subgroup of $F/[F, F]$. Since a subgroup of a free abelian group is a
free abelian group, $R/(R \cap [F, F])$ is a free abelian group of rank at most $n$. Also
$(F/[F, F])/(R[F, F]/[F, F]) \cong F/R[F, F]$, which is a quotient of the finite group $F/R \cong K$.
Hence $(F/[F, F])/(R[F, F]/[F, F])$ is finite. This means that $R[F, F]/[F, F]$, and so also
$R/(R \cap [F, F])$, is free abelian of rank $n$. Next, by Corollary 10.3.18, $M(K) = (R \cap [F, F])/[R, F]$
is a finite subgroup of $R/[R, F]$ such that $(R/[R, F])/((R \cap [F, F])/[R, F]) \cong R/(R \cap [F, F])$
is free abelian. This shows that $M(K)$ is the torsion part of $R/[R, F]$ and $R/(R \cap [F, F])$
is the torsion-free part of $R/[R, F]$. It also follows that $R/[R, F]$ is the direct sum of $M(K)$
and $R/(R \cap [F, F])$. ♯


Corollary 10.3.21 Let

$$E \equiv 1 \longrightarrow R \xrightarrow{i} F \xrightarrow{\nu} K \longrightarrow 1$$

be a free presentation of a finite group $K$, where $F$ is the free group on a set $\{x_1, x_2, \ldots, x_n\}$
consisting of $n$ elements, and $R$ the normal subgroup of $F$ generated as a normal
subgroup by a set $\{w_1, w_2, \ldots, w_r\}$ consisting of $r$ relators. Suppose that $m$ is the
minimum number of generators for $M(K)$. Then $r \geq n + m$. Equivalently, $M(K)$ can be
generated by $r - n$ elements.


Proof Since $R$ is generated as a subgroup by the set $\{w_1, w_2, \ldots, w_r\}$ and its
conjugates, and $w_i[R, F] = ww_iw^{-1}[R, F]$ for all $w \in F$, it follows that $R/[R, F]$ is
generated by the set $\{w_1[R, F], w_2[R, F], \ldots, w_r[R, F]\}$. From Proposition 10.3.20,
$R/[R, F]$ is the direct sum of $M(K)$ and a free abelian group of rank $n$, and so any
generating set of $R/[R, F]$ contains at least $n + m$ elements. Thus,
$r \geq n + m$. ♯


Corollary 10.3.22 Let $K$ be a finite group having a presentation with generating set
$\{x_1, x_2, \ldots, x_n\}$, and the set $\{w_1, w_2, \ldots, w_r\}$ as an irreducible set of defining relations.
Then $r \geq n$. If $r = n$, then the Schur multiplier $M(K)$ is trivial. Further, if $r = n + 1$,
then $M(K)$ is cyclic. If $r = n + 2$, then $M(K)$ is either a finite cyclic group, or it is
a direct product of two finite cyclic groups.


Proof From the above corollary, $r \geq n$. If $r = n$, then the minimum number $m$ for a
generating set of $M(K)$ is 0. Hence $M(K)$ is trivial. Suppose that $r = n + 1$. Then the
minimum number $m$ of generators of $M(K)$ is at most 1, and so $M(K)$ is finite cyclic.
Suppose that $r = n + 2$. Then the minimum number $m$ of generators for $M(K)$ is at
most 2. Since $M(K)$ is a finite abelian group, it is a direct product of finite cyclic groups
of prime power orders. Since a direct product of cyclic groups of co-prime orders is a
cyclic group, and $M(K)$ is generated by at most two elements, it follows that $M(K)$ is
either cyclic or else it is a direct product of two finite cyclic groups. ♯


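Example 10.3.23 If $F$ is a free group, then

$$1 \longrightarrow \{e\} \longrightarrow F \xrightarrow{I_F} F \longrightarrow 1$$

is a free presentation of $F$. Hence by the definition $M(F) = \{0\}$. In particular, the Schur
multiplier of an infinite cyclic group is trivial. Further,

$$\{0\} \longrightarrow m\mathbb{Z} \xrightarrow{i} \mathbb{Z} \xrightarrow{\nu} \mathbb{Z}_m \longrightarrow \{0\}$$

is a free presentation of $\mathbb{Z}_m$. Since $\mathbb{Z}$ is abelian, $[\mathbb{Z}, \mathbb{Z}] = \{0\}$. As such, the Schur
multiplier $M(\mathbb{Z}_m) = \{0\}$. Alternatively, using Corollary 10.3.18, $M(\mathbb{Z}_m) \cong H^2(\mathbb{Z}_m, \mathbb{C}^*)$ and
then, using Example 10.2.6 and the fact that every element of $\mathbb{C}^*$ is an $m$th power, we see
that $H^2(\mathbb{Z}_m, \mathbb{C}^*) = \{0\}$. Using the fundamental theorem of finite
abelian groups, we can find the Schur multipliers of all finite abelian groups provided
we have a formula which relates $M(A \times B)$, $M(A)$ and $M(B)$ for all finite abelian
groups. This will follow in the sequel.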


Example 10.3.24 Consider the quaternion group $Q_8$. It has a presentation
$\langle i, j;\ i^4,\ i^2 = j^2,\ jij^{-1} = i^{-1} \rangle$. Indeed, $i^4$ is derivable from the other two relators as
follows: $i^2 = j^2 = jj^2j^{-1} = ji^2j^{-1} = (jij^{-1})^2 = i^{-2}$. Hence $i^4 = 1$. Thus, $Q_8$ has a
presentation $\langle i, j;\ i^2 = j^2,\ jij^{-1} = i^{-1} \rangle$ with two generators and two defining relations.
As such, by Corollary 10.3.22, $M(Q_8)$ is trivial. More generally, a generalized quaternion
group $Q_{4n}$ of order $4n$ has a presentation $\langle x, y;\ x^{2n},\ x^n = y^2,\ yxy^{-1} = x^{-1} \rangle$. It is easily seen
that this is a group of order $4n$. Here also $x^{2n}$ is derivable from the other 2 relations
as follows. We have $x^n = y^2 = yy^2y^{-1} = yx^ny^{-1} = x^{-n}$. Hence $x^{2n} = e$. Thus,
$Q_{4n}$ is a finite group generated by two elements with two defining relations. As such,
by Corollary 10.3.22, $M(Q_{4n})$ is trivial.

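The two relations used for $Q_8$ can be checked in a concrete model. The following sketch is an illustration with assumed conventions: elements are integer quaternions stored as 4-tuples $(a, b, c, d) = a + bi + cj + dk$, multiplied by the Hamilton product (the helper names `qmul`, `conj` and `generated` are chosen here). It verifies $i^2 = j^2$ and $jij^{-1} = i^{-1}$ and counts the subgroup generated by $i$ and $j$. It only confirms that this model satisfies the relations and has 8 elements; it does not by itself prove that the presentation defines $Q_8$.

```python
def qmul(p, q):
    """Hamilton product of quaternions given as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def conj(p):
    """Conjugate, which is the inverse for quaternions of norm 1."""
    a, b, c, d = p
    return (a, -b, -c, -d)


def generated(gens, one=(1, 0, 0, 0)):
    """Closure of the generators under multiplication (finite in this example)."""
    seen = {one}
    frontier = [one]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = qmul(g, s)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen


i = (0, 1, 0, 0)
j = (0, 0, 1, 0)

print(qmul(i, i) == qmul(j, j))                # relation i^2 = j^2
print(qmul(qmul(j, i), conj(j)) == conj(i))    # relation j i j^{-1} = i^{-1}
print(len(generated([i, j])))                  # order of the generated group
```

The derived relation $i^4 = 1$ can be observed in the same model by multiplying `i` with itself four times.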
Proposition 10.3.25 Let

$$1 \longrightarrow R \xrightarrow{i} F \xrightarrow{\nu} K \longrightarrow 1$$

be a free presentation of a finite group $K$, where $F$ is a free group of rank $n$. Suppose
that $L$ is a finite subgroup of $R/[R, F]$ such that the quotient of $R/[R, F]$ modulo $L$ is
generated by $n$ elements. Then $M(K) \cong L$.


Proof Let $A$ be the torsion part of $R/[R, F]$ and $B$ the torsion-free part of $R/[R, F]$.
Then $B$ is free abelian of rank $n$. Also $R/[R, F] \cong A \oplus B$. Since $L$ is finite, $L \subseteq A$.
Thus, $(R/[R, F])/L \cong (A/L) \oplus B$. If $L \neq A$, then $(R/[R, F])/L$ cannot be generated
by $n$ elements. This shows that $L$ is the torsion subgroup of $R/[R, F]$. The result
follows from Proposition 10.3.20. ♯


Example 10.3.26 The dihedral group $D_{2n}$ of order $2n$ is given by a presentation $\langle x, y;\ x^n,\ y^2,\ yxy^{-1} = x^{-1} \rangle$. This is generated by two elements with three defining relations. Thus, $M(D_{2n})$ has to be cyclic. Every element of $D_{2n}$ has a unique representation as $x^i y^j$, $0 \le i \le n - 1$, $0 \le j \le 1$, and so $D_{2n}$ is a group of order $2n$. $D_{2n}$ has a free central extension given by

\[ 1 \longrightarrow R/[R, F] \longrightarrow F/[R, F] \longrightarrow D_{2n} \longrightarrow 1, \]




where $F$ is the free group on $\{x, y\}$, and $R$ the subgroup of $F$ generated by $\{x^n, y^2, yxy^{-1}x\}$ and its conjugates. As already described in Proposition 10.3.20, $R/[R, F]$ is in the center of $F/[R, F]$. The torsion-free part of $R/[R, F]$ is a free abelian group of rank 2, and its torsion part is $M(D_{2n})$. Since $uwu^{-1}[R, F] = w[R, F]$ for all $u \in F$ and $w \in R$, it follows that $R/[R, F]$ is an abelian subgroup of $F/[R, F]$ which is generated by $\{x^n[R, F],\ y^2[R, F],\ yxy^{-1}x[R, F]\}$. Denote $x^n[R, F]$, $y^2[R, F]$ and $yxy^{-1}x[R, F]$ by $a$, $b$ and $c$ respectively. Then $R/[R, F]$ is generated by $\{a, b, c\}$. Now,

\[ a = x^n[R, F] = yx^ny^{-1}[R, F] = (yxy^{-1})^n[R, F] = (yxy^{-1}x\,x^{-1})^n[R, F] = (yxy^{-1}x[R, F])^n\,x^{-n}[R, F] = c^n a^{-1}, \]

where the last but one equality uses that $c$ is central. Thus $a^2 = c^n$.

Suppose that $n = 2m$ is even. Put $d = ac^{-m}$. Then $d^2 = e$. Suppose that $d = e$. Then $a = c^m$ and so $x^n(yxy^{-1}x)^{-m} \in [R, F]$. Pass to the infinite dihedral group $D_\infty$ given by the presentation $\langle x, y;\ y^2,\ yxy^{-1} = x^{-1} \rangle$. The image of $R$ in $D_\infty$ is $\langle x^n \rangle$, the image of $[R, F]$ is $\langle x^{2n} \rangle$, and the image of $x^n(yxy^{-1}x)^{-m}$ is $x^n$. This would give $x^n \in \langle x^{2n} \rangle$, which is impossible since $x$ has infinite order in $D_\infty$. Hence $d$ is an element of order 2 in $R/[R, F]$. Also the quotient group of $R/[R, F]$ modulo the subgroup $\langle d \rangle$ generated by $d$ is generated by $\{b\langle d \rangle, c\langle d \rangle\}$. It follows from the above proposition that $\langle d \rangle$ is the torsion part of $R/[R, F]$. This shows that $M(D_{4m})$ is the cyclic group of order 2.

Next, suppose that $n = 2m + 1$ is odd. Now, put $d = ac^{-m}$. Then $d^2 = c$. Hence $a, c \in \langle d \rangle$. Thus, in this case, $R/[R, F]$ is generated by $\{b, d\}$. Already, $R/[R, F]$ is the direct sum of $M(D_{4m+2})$ with a free abelian group of rank 2. If $M(D_{4m+2})$ is nontrivial, $R/[R, F]$ cannot be generated by two elements. Hence $M(D_{4m+2})$ is trivial.
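The parity dichotomy of this example can be checked numerically. The following is a minimal sketch, assuming (as the rank argument above implies) that $a^2 = c^n$ is the only relation among $a, b, c$, so that $R/[R, F]$ is $\mathbb{Z}^3$ modulo the subgroup generated by $(2, 0, -n)$. For a single relation vector the torsion subgroup of the quotient is cyclic of order the gcd of the entries. The function names are hypothetical and chosen only for illustration.

```python
from functools import reduce
from math import gcd


def torsion_order_one_relation(relation):
    """Order of the torsion subgroup of Z^k / <relation> for one nonzero vector.

    The quotient is isomorphic to Z^(k-1) + Z/d, where d is the gcd of the entries.
    """
    d = reduce(gcd, (abs(v) for v in relation))
    if d == 0:
        raise ValueError("relation vector must be nonzero")
    return d


def dihedral_multiplier_order(n):
    # Assumed single relation a^2 = c^n, written additively as 2a + 0b - n c = 0.
    return torsion_order_one_relation((2, 0, -n))


for n in range(2, 9):
    print(n, dihedral_multiplier_order(n))
```

Under the stated assumption, the printed orders alternate between 2 (even n) and 1 (odd n), in agreement with the conclusion of the example.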


Example 10.3.27 Consider the group $G$ having presentation

\[ \langle x, y;\ x^5,\ y^3,\ (xy)^2 \rangle . \]

If we take $x = (12345)$ and $y = (153)$ in $A_5$, or the images $x$ and $y$ of suitable matrices of orders 5 and 3 in $PSL(2, 5)$, then $x^5$, $y^3$ and $(xy)^2$ represent the identities in the respective groups. Also they generate the respective groups. As such there is a surjective homomorphism from $G$ to $A_5$, and also a surjective homomorphism from $G$ to $PSL(2, 5)$. Using the coset enumeration method of Coxeter and Todd, one finds that the order of $G$ is 60, which is the same as that of $A_5$, and also that of $PSL(2, 5)$. It follows that $\langle x, y;\ x^5, y^3, (xy)^2 \rangle$ is a presentation of $A_5$, and also a presentation of $PSL(2, 5)$. It also turns out that $A_5$ is isomorphic to $PSL(2, 5)$. Let $F$ denote the free group on $\{x, y\}$ and $R$ the normal subgroup of $F$ generated by $\{x^5, y^3, (xy)^2\}$.
To find the Schur multiplier, we need to find the torsion part of $R/[R, F]$. Put $a = x^5[R, F]$, $b = y^3[R, F]$ and $c = (xy)^2[R, F]$. Then $R/[R, F]$ is a central subgroup of $F/[R, F]$ generated by $\{a, b, c\}$. We find relations between $a$, $b$, and $c$ in $R/[R, F]$. Indeed, we show that $c^{30} = a^{12}b^{20}$. Since $c$ is in the center of $F/[R, F]$,

\[ c^2 = (xy)^2[R, F]\,(xy)^2[R, F] = x(xy)^2yxy[R, F] = x^2yxy^2xy[R, F]. \]

Again inserting $(xy)^2$ between $x^2$ and $y$ in the above expression, we find that $c^3 = x^3y(xy^2)^2xy[R, F]$. Repeating this process by putting $(xy)^2$ in between $x^3$ and $y$, and again in turn putting $(xy)^2$ in between $x^4$ and $y$ in the resulting expression, we find that $c^5 = x^5y(xy^2)^4xy[R, F] = a\,y(xy^2)^4xy[R, F]$. Since $c$ and $a$ commute with all the elements of $F/[R, F]$, $a^{-1}c^5y^{-1}[R, F] = a^{-1}y^{-1}[R, F]c^5 = (xy^2)^4xy[R, F]$. Hence

\[ c^5 = a(xy^2)^5[R, F]. \]

Putting again $(xy)^2$ in between $x$ and $y^2$ in the above expression, we get

\[ c^{10} = a(x^2yxy^3[R, F])^5 = a(y^3[R, F])^5(x^2yx[R, F])^5 = ab^5(x^2yx[R, F])^5 = ab^5\,x^2y(x^3y)^4x[R, F]. \]

Further, since $c$, $a$, $b$ commute with all elements of $F/[R, F]$, we have $(x^3y)^4[R, F] = y^{-1}x^{-3}(ab^5)^{-1}c^{10}$. In turn

\[ c^{10} = ab^5(x^3y[R, F])^5. \]

Again, iterating the same procedure, i.e., putting $(xy)^2$ in between $x^3$ and $y$ in the above expression, we find

\[ c^{15} = ab^5(x^4yxy^2[R, F])^5. \]

Iterating finally, we get

\[ c^{30} = a^{12}b^{20}. \]

Thus, if we put $p = a^{6}b^{10}c^{-15}$, then $p^2$ is the identity of $R/[R, F]$. Also if we put
q = @ bc andr = ab*c—3, then R/[R, F] =< a,b,c > =< p,g,r >. Hence 
the quotient group $(R/[R, F])/\langle p \rangle$ is generated by two elements. Since the torsion-free part of $R/[R, F]$ is a free abelian group of rank 2, $\langle p \rangle$ is the torsion part of $R/[R, F]$ (Proposition 10.3.25). This shows that the Schur multiplier of $A_5 \cong PSL(2, 5)$ is a group of order at most 2. Further, we have a central extension

\[ 1 \longrightarrow A \longrightarrow SL(2, 5) \longrightarrow PSL(2, 5) \longrightarrow 1, \]

where $A$ is the center of $SL(2, 5)$. Thus $A$ is the group $\{aI_2 \mid a \in \mathbb{Z}_5^*,\ a^2 = 1\} = \{I_2, -I_2\}$. This means that $A$ is a group of order 2. Since $SL(2, 5)$ is a perfect group, by Proposition 10.3.9, $Hom(A, \mathbb{C}^*) \cong A$ is a subgroup of the Schur multiplier $H^2(PSL(2, 5), \mathbb{C}^*)$ of $PSL(2, 5)$. Thus, the Schur multiplier $M(PSL(2, 5)) \cong M(A_5)$ is a group of order 2.
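Some of the finite facts used in this example can be confirmed by brute force. The sketch below is illustrative only and rests on stated assumptions: permutations are stored 0-based as tuples of images, the generators are taken to be $x = (1\,2\,3\,4\,5)$ and $y = (1\,5\,3)$, and all helper names are hypothetical. It checks that $x^5$, $y^3$ and $(xy)^2$ are the identity, that $x$ and $y$ generate a group of order 60, and that $SL(2, 5)$ has order 120 with center $\{I_2, -I_2\}$. It does not replace the coset enumeration, which concerns the abstract presentation.

```python
from itertools import product


def compose(p, q):
    """Permutation 'apply q first, then p' (tuples of images on {0,...,n-1})."""
    return tuple(p[q[i]] for i in range(len(p)))


def power(p, k):
    result = tuple(range(len(p)))
    for _ in range(k):
        result = compose(result, p)
    return result


def generated_group(gens):
    """Closure of the generators under right multiplication (a subgroup, by finiteness)."""
    start = tuple(range(len(gens[0])))
    seen = {start}
    frontier = [start]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen


# Assumed generators: x = (1 2 3 4 5), y = (1 5 3), written 0-based.
x = (1, 2, 3, 4, 0)
y = (4, 1, 0, 3, 2)
identity = tuple(range(5))

P = 5  # the field Z_5
# Elements of SL(2, 5) as tuples (a, b, c, d) standing for [[a, b], [c, d]].
sl2 = [m for m in product(range(P), repeat=4) if (m[0] * m[3] - m[1] * m[2]) % P == 1]


def matmul(m, n):
    a, b, c, d = m
    e, f, g, h = n
    return ((a * e + b * g) % P, (a * f + b * h) % P,
            (c * e + d * g) % P, (c * f + d * h) % P)


center = sorted(m for m in sl2 if all(matmul(m, n) == matmul(n, m) for n in sl2))

print(power(x, 5) == identity, power(y, 3) == identity,
      power(compose(x, y), 2) == identity)
print(len(generated_group([x, y])), len(sl2), center)
```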


Five-Term Exact Sequence 
For convenience, in all the group extensions of the type 


\[ 1 \longrightarrow H \xrightarrow{\ \alpha\ } G \xrightarrow{\ \beta\ } K \longrightarrow 1, \]

$\alpha$ will be treated as the inclusion map $i$. Thus, $H$ is treated as a normal subgroup of $G$. Needless to say, there is no loss of generality.




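Theorem 10.3.28 To every group extension $E$ given by

\[ E \equiv 1 \longrightarrow H \xrightarrow{\ i\ } G \xrightarrow{\ \beta\ } K \longrightarrow 1, \]

there is an associated connecting homomorphism $\delta(E)$ from $M(K)$ to $H/[H, G]$, and in turn the five-term exact sequence

\[ M(G) \xrightarrow{M(\beta)} M(K) \xrightarrow{\delta(E)} H/[H, G] \xrightarrow{\ \bar{i}\ } G_{ab} \xrightarrow{\ \bar{\beta}\ } K_{ab} \longrightarrow 1, \]

which is natural in the sense that given any extension $E'$ of $H'$ by $K'$, and a morphism $(\mu/H, \mu, \nu)$ from $E$ to $E'$, the diagram

\[ \begin{array}{ccccccccccc}
M(G) & \longrightarrow & M(K) & \longrightarrow & H/[H, G] & \longrightarrow & G_{ab} & \longrightarrow & K_{ab} & \longrightarrow & 1 \\
\downarrow & & \downarrow & & \downarrow & & \downarrow & & \downarrow & & \\
M(G') & \longrightarrow & M(K') & \longrightarrow & H'/[H', G'] & \longrightarrow & G'_{ab} & \longrightarrow & K'_{ab} & \longrightarrow & 1
\end{array} \]

whose vertical maps are induced by $\mu$ and $\nu$, is commutative, where $G_{ab} = G/[G, G]$ and $K_{ab} = K/[K, K]$ are the abelianizers of $G$ and $K$ respectively.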


Proof By Corollaries 10.3.16 and 10.3.17, $M$ is a functor from the category of groups to the category of abelian groups. Thus, it is sufficient to establish the five-term exact sequence with a choice of a free presentation of $G$ and that of $K$. Let

\[ 1 \longrightarrow R \xrightarrow{\ i\ } F \xrightarrow{\ \mu\ } G \longrightarrow 1 \]

be a free presentation of $G$. Let $S = \mu^{-1}(H)$. Then $R \subseteq S$ and we have a free presentation

\[ 1 \longrightarrow S \xrightarrow{\ i\ } F \xrightarrow{\beta \circ \mu} K \longrightarrow 1 \]

of $K$. Clearly, $\mu$ takes $S$ to $H$, and indeed, $[S, F]$ to $[H, G]$. In turn, we have a natural map $\delta(E)$ from $M(K) \cong (S \cap [F, F])/[S, F]$ to $H/[H, G]$ given by $\delta(E)(s[S, F]) = \mu(s)[H, G]$. This gives us a five-term sequence

\[ M(G) \xrightarrow{M(\beta)} M(K) \xrightarrow{\delta(E)} H/[H, G] \xrightarrow{\ \bar{i}\ } G_{ab} \xrightarrow{\ \bar{\beta}\ } K_{ab} \longrightarrow 1. \]

We prove the exactness of the sequence. Since $\beta$ is surjective, $\bar{\beta}$ is surjective. Again, since $\beta \circ i$ is the trivial map, image $\bar{i} \subseteq \ker \bar{\beta}$. Suppose that $\bar{\beta}(g[G, G]) = [K, K]$. Then $\beta(g) \in [K, K]$. Hence there is a $u \in [G, G]$ such that $\beta(g) = \beta(u)$. In turn, there is an $h \in H$ such that $g = hu$. Clearly, $\bar{i}(h[H, G]) = g[G, G]$. This proves the exactness at $G_{ab}$.


Next, since $\mu(s) \in H \cap [G, G]$ for all $s \in S \cap [F, F]$, it follows that

\[ \bar{i}(\delta(E)(s[S, F])) = \bar{i}(\mu(s)[H, G]) = \mu(s)[G, G] = [G, G]. \]

This shows that image $\delta(E) \subseteq \ker \bar{i}$. Let $h[H, G] \in \ker \bar{i}$. Then $h \in H \cap [G, G]$. This means that $h \in \mu(S \cap [F, F])$. It follows that $h[H, G] \in$ image $\delta(E)$. Finally, we prove the exactness at $M(K)$. Let $r[R, F] \in M(G)$, $r \in R \cap [F, F]$. Then by definition, $\delta(E)(M(\beta)(r[R, F])) = \delta(E)(r[S, F]) = \mu(r)[H, G] = [H, G]$, for $\mu(r) = e$. This shows that image $M(\beta) \subseteq \ker \delta(E)$. Further, suppose that $\delta(E)(s[S, F]) = \mu(s)[H, G] = [H, G]$, where $s \in S \cap [F, F]$. Then $\mu(s) \in [H, G] = \mu([S, F])$. Hence there is a $t \in [S, F]$ such that $\mu(s) = \mu(t)$. In turn, $\mu(st^{-1}) = e$. Thus, $s = rt$ for some $r \in R$, and $r = st^{-1} \in R \cap [F, F]$ since $s, t \in [F, F]$. But then $s[S, F] = rt[S, F] = r[S, F] = M(\beta)(r[R, F])$. It follows that image $M(\beta) = \ker \delta(E)$. □
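As an illustration, consider the special case in which the extension $E$ is itself a free presentation $1 \to R \to F \to K \to 1$ of $K$. The sketch below assumes that the Schur multiplier of a free group is trivial (take the free presentation with trivial relator subgroup); under that assumption the five-term sequence recovers the description of $M(K)$ used in the proof:

```latex
\[
0 \longrightarrow M(K) \xrightarrow{\ \delta(E)\ } R/[R, F]
  \longrightarrow F/[F, F] \longrightarrow K/[K, K] \longrightarrow 1,
\]
so that $\delta(E)$ identifies $M(K)$ with the kernel of
$R/[R, F] \to F/[F, F]$, namely $(R \cap [F, F])/[R, F]$.
```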


We give another interpretation of the group $M(G)$ as the group of commutator relations. For any group $G$, the commutator operation $[x, y] = xyx^{-1}y^{-1}$ can be easily seen to satisfy the following relations, called the trivial commutator relations.
(i) $[x, x] = e$.
(ii) $[x, y][y, x] = e$.
(iii) $[xyx^{-1}, xzx^{-1}][x, z][z, xy] = e$.
(iv) $[xyx^{-1}, xzx^{-1}][z, y][yzy^{-1}z^{-1}, x] = e$.
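The four trivial commutator relations hold in every group, and a brute-force check in small symmetric groups is a quick sanity test of their exact form. The following sketch is illustrative; it uses the convention $[x, y] = xyx^{-1}y^{-1}$ from the text, represents permutations as tuples of images, and its helper names are hypothetical.

```python
from itertools import permutations, product


def mul(p, q):
    """Product pq of permutations: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(p)))


def inv(p):
    result = [0] * len(p)
    for i, image in enumerate(p):
        result[image] = i
    return tuple(result)


def comm(x, y):
    """Commutator [x, y] = x y x^{-1} y^{-1}."""
    return mul(mul(x, y), mul(inv(x), inv(y)))


def conj(x, y):
    """Conjugate x y x^{-1}."""
    return mul(mul(x, y), inv(x))


def trivial_relations_hold(n):
    """Check relations (i)-(iv) for all triples (x, y, z) in the symmetric group S_n."""
    group = list(permutations(range(n)))
    e = tuple(range(n))
    for x, y, z in product(group, repeat=3):
        r1 = comm(x, x)
        r2 = mul(comm(x, y), comm(y, x))
        r3 = mul(mul(comm(conj(x, y), conj(x, z)), comm(x, z)), comm(z, mul(x, y)))
        r4 = mul(mul(comm(conj(x, y), conj(x, z)), comm(z, y)), comm(comm(y, z), x))
        if not (r1 == e and r2 == e and r3 == e and r4 == e):
            return False
    return True


print(trivial_relations_hold(3), trivial_relations_hold(4))
```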

The group $M(G)$ can be viewed as the group of nontrivial commutator relations in $G$. More precisely, consider the free group $F(X)$ on $X$, where $X = G \times G - (G \times \{e\} \cup \{e\} \times G)$. We identify $(x, e)$ and $(e, x)$ with the identity of $F(X)$. From the universal property of a free group, we have a unique homomorphism $\eta$ from $F(X)$ to $G$ given by $\eta((x, y)) = [x, y] = xyx^{-1}y^{-1}$. Let $\Omega(G)$ denote the normal subgroup of $F(X)$ generated by the set of elements of the types $(x, x)$, $(x, y)(y, x)$, $(xyx^{-1}, xzx^{-1})(x, z)(z, xy)$ and $(xyx^{-1}, xzx^{-1})(z, y)(yzy^{-1}z^{-1}, x)$. Clearly, $\Omega(G)$ is contained in the kernel of $\eta$. As such, it induces a unique homomorphism, denoted again by $\eta$, from $F(X)/\Omega(G)$ onto the commutator subgroup $[G, G]$ of $G$. The proof of the following theorem involves some computations which we omit; we refer to the book “Schur Multiplier” by Karpilovsky for the details of the computations.


Theorem 10.3.29 The kernel of the above-described map $\eta$ is the Schur multiplier $M(G)$, and we have the natural short exact sequence

\[ 1 \longrightarrow M(G) \xrightarrow{\ i\ } F(X)/\Omega(G) \xrightarrow{\ \eta\ } [G, G] \longrightarrow 1. \] □

Thus, $M(G)$ can be viewed as the group of commutator relations in $G$ modulo the trivial commutator relations.

Definition 10.3.30 The group $F(X)/\Omega(G)$ introduced above is called the non-abelian exterior power of $G$, and it is denoted by $G \wedge G$.




Corollary 10.3.31 If $G$ is an abelian group, then $M(G) \cong G \wedge G$. □


Tensor Product and Exterior Power of Groups 
Let $K$ and $L$ be groups. Let $G$ be another group. A map $\eta$ from $K \times L$ to $G$ is called a bi-multiplicative map, if

(i) $\eta(kk', l) = \eta(k, l)\eta(k', l)$ and
(ii) $\eta(k, ll') = \eta(k, l)\eta(k, l')$

for all $k, k' \in K$ and $l, l' \in L$. Note that if $\eta$ is a bi-multiplicative map, then $\eta(e, l) = e = \eta(k, e)$ for all $k \in K$ and $l \in L$.


Proposition 10.3.32 For any pair of groups $K$ and $L$, there is a pair $(K \otimes L, \eta)$, where $K \otimes L$ is a group with $\eta$ a bi-multiplicative map from $K \times L$ to $K \otimes L$ which is universal in the sense that for any pair $(G', \eta')$ with $\eta'$ a bi-multiplicative map from $K \times L$ to $G'$, there is a unique homomorphism $\mu$ from $K \otimes L$ to $G'$ such that $\mu \circ \eta = \eta'$.


Proof Take $K \otimes L$ to be the group with presentation $\langle X; R \rangle$, where the generating set $X = \{(k, l) \in K \times L \mid k \neq e \neq l\}$ and the set $R$ of relators is given by $R = \{(kk', l)(k', l)^{-1}(k, l)^{-1},\ (k, ll')(k, l')^{-1}(k, l)^{-1} \mid k, k' \in K \text{ and } l, l' \in L\}$. Thus, $K \otimes L = F(X)/H$, where $H$ is the normal subgroup generated by $R$. The map $\eta$ is given by $\eta(k, l) = (k, l)H$. We denote $\eta(k, l)$ by $k \otimes l$. Clearly, $\eta$ is a bi-multiplicative map, that is, $kk' \otimes l = (k \otimes l)(k' \otimes l)$ and also $k \otimes ll' = (k \otimes l)(k \otimes l')$. Let $(G', \eta')$ with $\eta'$ a bi-multiplicative map from $K \times L$ to $G'$ be another pair. From the universal property of a free group, we have a unique homomorphism $\chi$ from $F(X)$ to $G'$ such that $\chi((k, l)) = \eta'(k, l)$. The supposition that $\eta'$ is a bi-multiplicative map from $K \times L$ to $G'$ ensures that $\chi$ takes the relators in $R$ to $e$. This means that $H$ is contained in the kernel of $\chi$. By the fundamental theorem of homomorphism, $\chi$ induces a unique homomorphism $\mu$ from $K \otimes L$ to $G'$ such that $\mu(\eta(k, l)) = \mu((k, l)H) = \chi((k, l)) = \eta'(k, l)$. □


Proposition 10.3.33 The pair $(K \otimes L, \eta)$ introduced above is unique in the sense that if $(G', \eta')$ is another such pair, then there is a unique isomorphism $\mu$ from $K \otimes L$ to $G'$ such that $\mu \circ \eta = \eta'$.

Proof From the universal property of the pair $(K \otimes L, \eta)$ established in the above proposition, there is a unique homomorphism $\mu$ from $K \otimes L$ to $G'$ such that $\mu \circ \eta = \eta'$. Since the pair $(G', \eta')$ is also assumed to have the same universal property, there is a unique homomorphism $\nu$ from $G'$ to $K \otimes L$ such that $\nu \circ \eta' = \eta$. Thus, $\nu \circ \mu$ and $I_{K \otimes L}$ are both homomorphisms from $K \otimes L$ to itself such that $(\nu \circ \mu) \circ \eta = \eta = I_{K \otimes L} \circ \eta$. From the universal property of the pair $(K \otimes L, \eta)$, $\nu \circ \mu = I_{K \otimes L}$. Reversing the roles of $(K \otimes L, \eta)$ and $(G', \eta')$, we get that $\mu \circ \nu = I_{G'}$. Thus, $\mu$ is an isomorphism with the required property. □

Definition 10.3.34 The pair $(K \otimes L, \eta)$ is called the tensor product of $K$ and $L$. By abuse of language we also say that $K \otimes L$ is a tensor product of $K$ and $L$. The image $\eta((k, l))$ is denoted by $k \otimes l$.




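Thus, (i) $kk' \otimes l = (k \otimes l)(k' \otimes l)$, (ii) $k \otimes ll' = (k \otimes l)(k \otimes l')$. In turn, $e \otimes l = e = k \otimes e$ and $k^{-1} \otimes l = (k \otimes l)^{-1} = k \otimes l^{-1}$.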


Proposition 10.3.35 Let $\eta$ be a bi-multiplicative map from $K \times L$ to $G$. Then the image $\eta(K \times L)$ of $\eta$ generates an abelian subgroup of $G$.

Proof For $k, k' \in K$ and $l, l' \in L$,

\[ \eta(kk', ll') = \eta(kk', l)\eta(kk', l') = \eta(k, l)\eta(k', l)\eta(k, l')\eta(k', l'). \]

On the other hand,

\[ \eta(kk', ll') = \eta(k, ll')\eta(k', ll') = \eta(k, l)\eta(k, l')\eta(k', l)\eta(k', l'). \]

Comparing, $\eta(k', l)\eta(k, l') = \eta(k, l')\eta(k', l)$ for all $k, k' \in K$ and $l, l' \in L$. This means that the elements of $\eta(K \times L)$ commute pairwise. □


Corollary 10.3.36 The tensor product $K \otimes L$ of any two groups $K$ and $L$ is an abelian group, and it is isomorphic to $K_{ab} \otimes L_{ab}$, where $K_{ab}$ denotes the abelianizer $K/[K, K]$ of $K$.

Proof Since $K \otimes L$ is generated by $\{k \otimes l \mid k \in K, l \in L\}$, it follows from the above proposition that $K \otimes L$ is abelian. Let us denote the coset $k[K, K]$ by $\bar{k}$. Define a map $\eta$ from $K \times L$ to $K_{ab} \otimes L_{ab}$ by $\eta(k, l) = \bar{k} \otimes \bar{l}$. Evidently, $\eta$ is a bi-multiplicative map. As such, it induces a unique homomorphism $\bar{\eta}$ from $K \otimes L$ to $K_{ab} \otimes L_{ab}$ subject to $\bar{\eta}(k \otimes l) = \bar{k} \otimes \bar{l}$, which is clearly surjective. We show that it is bijective by constructing its inverse. Now, $[k, k'] \otimes l = kk'k^{-1}k'^{-1} \otimes l = (k \otimes l)(k' \otimes l)(k^{-1} \otimes l)(k'^{-1} \otimes l) = e$ for all $k, k' \in K$ and $l \in L$. Since every element of $[K, K]$ is a product of commutators, and taking tensor product is bi-multiplicative, it follows that $u \otimes l = e$ for all $u \in [K, K]$. Similarly, $k \otimes v = e$ for all $k \in K$ and $v \in [L, L]$. It follows that $\bar{k} = \bar{k'}$ implies that $k \otimes l = k' \otimes l$ for all $l \in L$, and also $\bar{l} = \bar{l'}$ implies that $k \otimes l = k \otimes l'$ for all $k \in K$. Thus, $\bar{k} = \bar{k'}$ and $\bar{l} = \bar{l'}$ imply that $k \otimes l = k' \otimes l'$. This ensures that we have a map $\chi$ from $K_{ab} \times L_{ab}$ to $K \otimes L$ defined by $\chi(\bar{k}, \bar{l}) = k \otimes l$. Clearly, $\chi$ is a bi-multiplicative map, and as such it induces a homomorphism $\bar{\chi}$ from $K_{ab} \otimes L_{ab}$ to $K \otimes L$ given by $\bar{\chi}(\bar{k} \otimes \bar{l}) = k \otimes l$. Clearly, $\bar{\chi}$ is the inverse of $\bar{\eta}$. □


It is evident from the above corollary that the theory of tensor products of groups reduces to the theory of tensor products of abelian groups through their abelianizers. As such, we state a few results which follow from the corresponding results on the tensor products of abelian groups (modules over $\mathbb{Z}$) (refer to Chap. 7 of the book).

Corollary 10.3.37 $K \otimes L$ is isomorphic to $L \otimes K$. □


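Proposition 10.3.38 Let

\[ H \xrightarrow{\ \alpha\ } G \xrightarrow{\ \beta\ } K \longrightarrow 1 \]

be an exact sequence, and $L$ be a group. Then the sequence

\[ H \otimes L \xrightarrow{\alpha \otimes I_L} G \otimes L \xrightarrow{\beta \otimes I_L} K \otimes L \longrightarrow 1 \]

is exact. □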


Proposition 10.3.39 Let $H$, $K$, and $L$ be groups. Then $(H \times K) \otimes L$ is naturally isomorphic to $(H \otimes L) \times (K \otimes L)$. □
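A small worked example may help to see how these reductions combine. The sketch below assumes two standard facts not proved in this section, namely that the abelianizer of the quaternion group $Q_8$ is $\mathbb{Z}_2 \times \mathbb{Z}_2$ and that $\mathbb{Z}_2 \otimes \mathbb{Z}_2 \cong \mathbb{Z}_2$; it then uses Corollary 10.3.36 together with the distributivity just stated (applied in both arguments via Corollary 10.3.37):

```latex
\[
Q_8 \otimes Q_8
  \;\cong\; (Q_8)_{ab} \otimes (Q_8)_{ab}
  \;\cong\; (\mathbb{Z}_2 \times \mathbb{Z}_2) \otimes (\mathbb{Z}_2 \times \mathbb{Z}_2)
  \;\cong\; (\mathbb{Z}_2 \otimes \mathbb{Z}_2)^{4}
  \;\cong\; \mathbb{Z}_2^{4}.
\]
```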


Proposition 10.3.40 Let $H$, $K$, and $L$ be groups. There is a tautological isomorphism from $(H \otimes K) \otimes L$ to $H \otimes (K \otimes L)$ which maps $(h \otimes k) \otimes l$ to $h \otimes (k \otimes l)$. □

Proposition 10.3.41 Let $H$, $K$, and $L$ be groups with $L$ being abelian. Then there is a natural isomorphism from $Hom(H, Hom(K, L))$ to $Hom(H \otimes K, L)$. □


We further state a few results without proof which can be used for some computations. For the proofs we refer to the book “Schur Multiplier” by Karpilovsky.

Proposition 10.3.42 Let $H$ and $K$ be groups. Then $M(H \times K) \cong M(H) \oplus M(K) \oplus (H \otimes K)$. □

Thus, for cyclic groups $A$ and $B$, whose Schur multipliers are trivial, $M(A \times B) \cong A \otimes B$, and so the Schur multiplier of a finitely generated abelian group is easily determined. For free products, we have the following.

Proposition 10.3.43 $M(H * K) \cong M(H) \oplus M(K)$. □
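Proposition 10.3.42 can be iterated to compute the Schur multiplier of a finite abelian group from a decomposition into cyclic factors. The following is a minimal sketch, assuming the standard fact $\mathbb{Z}_m \otimes \mathbb{Z}_n \cong \mathbb{Z}_{\gcd(m, n)}$ together with the triviality of the Schur multiplier of a cyclic group and the distributivity of the tensor product; with these, $M(\mathbb{Z}_{n_1} \times \cdots \times \mathbb{Z}_{n_k})$ is the direct sum of the groups $\mathbb{Z}_{\gcd(n_i, n_j)}$ over all pairs $i < j$. The function name is hypothetical.

```python
from itertools import combinations
from math import gcd


def abelian_schur_multiplier(orders):
    """Orders of the cyclic factors of M(Z_{n1} x ... x Z_{nk}).

    One factor Z_{gcd(ni, nj)} arises for each pair i < j; trivial factors
    (gcd equal to 1) are omitted from the returned sorted list.
    """
    factors = [gcd(a, b) for a, b in combinations(orders, 2)]
    return sorted(d for d in factors if d > 1)


print(abelian_schur_multiplier([2, 2]))     # Klein four-group
print(abelian_schur_multiplier([6]))        # cyclic group
print(abelian_schur_multiplier([2, 4, 6]))
```

An empty list stands for the trivial group, as in the cyclic case.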
Exercises 


10.3.1 Let $K$ be a subgroup of a group $G$, and $A$ be an abelian group which is a trivial $G$-module. Show that we have a homomorphism $res_{(G,K)}$ from $H^2(G, A)$ to $H^2(K, A)$ given by $res_{(G,K)}(f + B^2(G, A)) = f/K \times K + B^2(K, A)$. The homomorphism $res_{(G,K)}$ is called the restriction homomorphism from $G$ to $K$. Observe that if $L$ is a subgroup of $K$, then $res_{(K,L)} \circ res_{(G,K)} = res_{(G,L)}$.

10.3.2 Let $K$ be a subgroup of a group $G$ of finite index $n$. Let $S = \{e = x_1, x_2, \ldots, x_n\}$ be a right transversal to $K$ in $G$. Given any element $g \in G$, for each $x_i \in S$, there is a unique element $\sigma_{x_i}(g) \in K$, and a unique element $x_i \star g \in S$ such that $x_i g = \sigma_{x_i}(g)(x_i \star g)$. Let $f$ be a 2 co-cycle in $Z^2(K, A)$, where $A$ is a trivial $K$-module. Show that the map $\bar{f}$ from $G \times G$ to $A$ defined by

\[ \bar{f}(g, h) = \prod_{i=1}^{n} f\bigl(\sigma_{x_i}(g),\ \sigma_{x_i \star g}(h)\bigr) \]

is a 2 co-cycle in $Z^2(G, A)$. Show also that $f \in B^2(K, A)$ implies that $\bar{f} \in B^2(G, A)$. Deduce that we have a co-restriction homomorphism $cores_{(K,G)}$ from $H^2(K, A)$ to $H^2(G, A)$ given by $cores_{(K,G)}(f + B^2(K, A)) = \bar{f} + B^2(G, A)$. Show also that $cores_{(K,G)} \circ cores_{(L,K)} = cores_{(L,G)}$.




10.3.3 Let $a = f + B^2(G, A)$ be an element of $H^2(G, A)$. Show that $(cores_{(K,G)} \circ res_{(G,K)})(a) = a^n$, where $K$ is a subgroup of index $n$.

10.3.4 Let $K$ be a normal subgroup of $G$ of index $n$, and $a \in H^2(K, A)$. Show that

\[ (res_{(G,K)} \circ cores_{(K,G)})(a) = a^n. \]

10.3.5 Use the fact that the group $\mathbb{R}^*$ has only one nonidentity element $-1$ of finite order to show that $H^2(G, \mathbb{R}^*) \cong H^2(G, \mathbb{Z}_2)$ for any finite group $G$.

10.3.6 Let $G$ be a finite group, and $D$ be a divisible group. Show that $H^2(G, D) \cong H^2(G, T(D))$, where $T(D)$ is the torsion part of $D$. Deduce that $M(G) \cong H^2(G, \mathbb{Q}/\mathbb{Z})$ for all finite groups $G$.

10.3.7 Compute $Q_8 \wedge Q_8$, and hence also $M(Q_8)$.


10.3.8 Use the five-term exact sequence associated to the extension 


to show that $M(Q_8) = \{0\}$.


10.3.9 Compute $P \wedge P$, where $P$ is a non-abelian group of order $p^3$, $p$ a prime. Hence compute $M(P)$.

10.3.10 Compute the Schur multiplier of a non-abelian group of order $pq$, where $p$ and $q$ are primes.

10.3.11 Find the Schur multipliers of $A_4$, and also of $S_4$.

10.3.12 Let $G$ be a finite nilpotent group. Show that the Schur multiplier of $G$ is the direct product of the Schur multipliers of its Sylow subgroups.

10.3.13 Let $m, n, r$ be positive integers such that $r^n \equiv 1 \pmod{m}$ and $(m, n) = 1 = (n, r - 1)$. Let $G$ be a group having a presentation $\langle x, y;\ x^m = 1 = y^n = y^{-1}xyx^{-r} \rangle$. Using the Tietze transformation (see Algebra 1), reduce the number of relators to 2, and then show that $M(G)$ is trivial.

10.3.14 Show that there is a surjective homomorphism from $M(GL(R))$ to $M(K_1(R))$.


10.4 Lower K-Theory Revisited 


In Chap. 7, Sect. 7.4, we introduced the Grothendieck group $K_0(R)$ and the Whitehead group $K_1(R)$ of a ring $R$. Recall that $K_1(R) = GL(R)/E(R)$, where $E(R)$ is the group generated by the elementary matrices. By the Whitehead lemma, $E(R)$ is the commutator subgroup $[GL(R), GL(R)]$ of $GL(R)$. In this section, we introduce the Milnor group $K_2(R)$ of a ring $R$, which can be viewed in two ways: (i) as the Schur multiplier $M(E(R))$, that is, the group of commutator relations among the elements of $E(R)$ modulo the trivial commutator relations, and (ii) as the group of relations among the transvections $E_{ij}^{\lambda}$ modulo the group of trivial relations, viz., the group of Steinberg relations. We describe it in detail.

Note: Usually, in the literature, an extension

\[ 1 \longrightarrow H \xrightarrow{\ i\ } G \xrightarrow{\ \beta\ } K \longrightarrow 1 \]

is termed an extension of $K$, but we shall adhere to our terminology by calling it an extension by $K$.


Proposition 10.4.1 Let $K$ be a perfect group in the sense that $[K, K] = K$, and

\[ E \equiv 1 \longrightarrow H \xrightarrow{\ i\ } G \xrightarrow{\ \beta\ } K \longrightarrow 1 \]

be a central extension by $K$. Then the commutator subgroup $[G, G]$ is perfect, and

\[ 1 \longrightarrow H \cap [G, G] \xrightarrow{\ i\ } [G, G] \xrightarrow{\ \beta\ } K \longrightarrow 1 \]

is also a central extension by $K$.

Proof Since $K$ is perfect, $\beta([G, G]) = [K, K] = K$. Thus, given any element $a \in G$, there is an element $u \in [G, G]$ such that $\beta(a) = \beta(u)$. This means that every element in $G$ is of the type $hu$ for some $h \in H$ and $u \in [G, G]$. Let $a = hu$, $b = h'u'$, where $h, h' \in H \subseteq Z(G)$, $u, u' \in [G, G]$, be arbitrary elements of $G$. Then $[a, b] = [u, u'] \in [[G, G], [G, G]]$. This shows that $[G, G] \subseteq [[G, G], [G, G]]$. Hence $[G, G]$ is perfect. The rest is evident. □
Proposition 10.4.2 Let 


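\[ E \equiv 1 \longrightarrow H \xrightarrow{\ i\ } G \xrightarrow{\ \beta\ } K \longrightarrow 1 \]

be a central extension by $K$, where $G$ is perfect. Let $\lambda$ and $\mu$ be homomorphisms from $G$ to $G'$ inducing the morphisms $(\lambda/H, \lambda, I_K)$ and $(\mu/H, \mu, I_K)$ from $E$ to a central extension $E'$ given by

\[ E' \equiv 1 \longrightarrow H' \xrightarrow{\ i'\ } G' \xrightarrow{\ \beta'\ } K \longrightarrow 1. \]

Then $\lambda = \mu$.

Proof Since $G$ is perfect, the commutators generate $G$. Thus, it is sufficient to show that $\lambda([a, b]) = \mu([a, b])$ for all $a, b \in G$. For each $x \in G$, $\beta'(\lambda(x)) = \beta'(\mu(x))$, and so there is an element $\nu(x) \in H'$ such that $\lambda(x) = \nu(x)\mu(x)$. Now, since $\nu(a)$ and $\nu(b)$ are central in $G'$, $\lambda([a, b]) = [\lambda(a), \lambda(b)] = [\nu(a)\mu(a), \nu(b)\mu(b)] = [\mu(a), \mu(b)] = \mu([a, b])$. □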
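The last chain of equalities rests on the fact that central factors drop out of a commutator. As a short derivation, with the convention $[x, y] = xyx^{-1}y^{-1}$ used in this chapter and with $\nu(a), \nu(b)$ denoting the central elements for which $\lambda(a) = \nu(a)\mu(a)$ and $\lambda(b) = \nu(b)\mu(b)$:

```latex
\begin{align*}
[\nu(a)\mu(a),\ \nu(b)\mu(b)]
  &= \nu(a)\mu(a)\,\nu(b)\mu(b)\,\mu(a)^{-1}\nu(a)^{-1}\,\mu(b)^{-1}\nu(b)^{-1} \\
  &= \mu(a)\mu(b)\mu(a)^{-1}\mu(b)^{-1} \cdot \nu(a)\nu(b)\nu(a)^{-1}\nu(b)^{-1} \\
  &= [\mu(a), \mu(b)].
\end{align*}
```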


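Definition 10.4.3 A central extension

\[ \Omega_K \equiv 1 \longrightarrow H \xrightarrow{\ i\ } U \xrightarrow{\ \nu\ } K \longrightarrow 1 \]

is called a universal central extension by $K$ if given any central extension

\[ E \equiv 1 \longrightarrow L \xrightarrow{\ \alpha\ } G \xrightarrow{\ \beta\ } K \longrightarrow 1 \]

by $K$, there is a unique homomorphism $\phi$ from $U$ to $G$ inducing a morphism $(\xi, \phi, I_K)$ from $\Omega_K$ to $E$.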


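Proposition 10.4.4 A universal central extension by $K$ is unique up to equivalence.
Proof Let

\[ \Omega'_K \equiv 1 \longrightarrow H' \xrightarrow{\ i'\ } U' \xrightarrow{\ \nu'\ } K \longrightarrow 1 \]

be another universal central extension by $K$. Then there is a unique homomorphism $\phi$ from $U$ to $U'$ inducing a morphism $(\xi, \phi, I_K)$ from $\Omega_K$ to $\Omega'_K$, and there is a unique homomorphism $\phi'$ from $U'$ to $U$ inducing a morphism $(\xi', \phi', I_K)$ from $\Omega'_K$ to $\Omega_K$. But then the homomorphisms $\phi' \circ \phi$ and $I_U$ induce morphisms $(\xi' \circ \xi, \phi' \circ \phi, I_K)$ and $(I_H, I_U, I_K)$ respectively from $\Omega_K$ to itself. From the universal property of $\Omega_K$, $\phi' \circ \phi = I_U$. Similarly, using the universal property of $\Omega'_K$, $\phi \circ \phi' = I_{U'}$. This shows that $\Omega_K$ is equivalent to $\Omega'_K$. □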


Proposition 10.4.5 If

\[ \Omega_K \equiv 1 \longrightarrow H \xrightarrow{\ i\ } U \xrightarrow{\ \beta\ } K \longrightarrow 1 \]

is a universal central extension, then $U$ is perfect ($U = [U, U]$).

Proof Suppose that $U$ is not perfect. Then $U/[U, U]$ is a nontrivial abelian group. Consider the direct product extension

\[ 1 \longrightarrow U/[U, U] \xrightarrow{\ i_1\ } U/[U, U] \times K \xrightarrow{\ p_2\ } K \longrightarrow 1 \]

by $K$, where $i_1$ is the first inclusion, and $p_2$ is the second projection. Clearly, this is a central extension by $K$. Further, the map $(\nu, \beta)$ defined by $(\nu, \beta)(u) = (u[U, U], \beta(u))$, and the map $(0, \beta)$ defined by $(0, \beta)(u) = ([U, U], \beta(u))$ are two distinct homomorphisms from $U$ to $U/[U, U] \times K$ which induce morphisms from $\Omega_K$ to the given direct product extension. Hence $\Omega_K$ cannot be a universal central extension. □


Since a homomorphic image of a perfect group is a perfect group, we have

Corollary 10.4.6 If $K$ admits a universal central extension by $K$, then $K$ is a perfect group. □

Conversely,

Proposition 10.4.7 Every perfect group $K$ admits (of course, a unique) universal central extension by $K$.


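Proof Suppose that $K$ is perfect. Let

\[ F_K \equiv 1 \longrightarrow R \xrightarrow{\ i\ } F \xrightarrow{\ \nu\ } K \longrightarrow 1 \]

be a free presentation of $K$. We have a central extension

\[ E_K \equiv 1 \longrightarrow R/[R, F] \longrightarrow F/[R, F] \xrightarrow{\ \bar{\nu}\ } K \longrightarrow 1 \]

by $K$. Since $K$ is perfect, $[F/[R, F], F/[R, F]] = [F, F]/[R, F]$ is perfect (by Proposition 10.4.1), and we have a central extension

\[ \Omega_K \equiv 1 \longrightarrow (R \cap [F, F])/[R, F] \longrightarrow [F, F]/[R, F] \xrightarrow{\ \bar{\nu}\ } K \longrightarrow 1 \]

by $K$. We prove that this is a universal central extension by $K$. Let

\[ E \equiv 1 \longrightarrow L \xrightarrow{\ \alpha\ } G \xrightarrow{\ \beta\ } K \longrightarrow 1 \]

be a central extension by $K$. Since $F$ is free, there is a homomorphism $\phi$ from $F$ to $G$ inducing a morphism $(\phi/R, \phi, I_K)$ from $F_K$ to $E$. Since $E$ is a central extension by $K$, $\phi$ maps $[R, F]$ into $[\alpha(L), G] = \{e\}$, and so it induces a morphism $(\bar{\phi}/R, \bar{\phi}, I_K)$ from $E_K$ to $E$, which in turn induces a morphism from $\Omega_K$ to $E$. Since $[F, F]/[R, F]$ is perfect, by Proposition 10.4.2 such a morphism is unique. □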


Corollary 10.4.8 If $K$ is perfect, then $K \wedge K$ is also a perfect group, and it is the universal central extension of $M(K)$ by $K$. More precisely,

\[ 1 \longrightarrow M(K) \xrightarrow{\ i\ } K \wedge K \xrightarrow{\ c\ } K \longrightarrow 1 \]

is a universal central extension, where $c$ is the commutator map given by $c(x \wedge y) = [x, y]$. □


Proposition 10.4.9 A central extension

\[ \Omega \equiv 1 \longrightarrow H \xrightarrow{\ i\ } U \xrightarrow{\ \beta\ } K \longrightarrow 1 \]

by $K$ is a universal central extension by $K$ if and only if $U$ is perfect and every central extension by $U$ splits.




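Proof Suppose that the given extension $\Omega$ is a universal central extension. Then by Proposition 10.4.5, $U$ is perfect. Let

\[ E \equiv 1 \longrightarrow L \xrightarrow{\ \alpha\ } G \xrightarrow{\ \delta\ } U \longrightarrow 1 \]

be a central extension by $U$. Then consider the extension

\[ E' \equiv 1 \longrightarrow \ker(\beta \circ \delta) \xrightarrow{\ i\ } G \xrightarrow{\beta \circ \delta} K \longrightarrow 1 \]

by $K$. We first show that this is a central extension by $K$. Let $g \in \ker(\beta \circ \delta)$. We need to show that $g \in Z(G)$. Since $\beta(\delta(g)) = e$, $\delta(g) \in \ker \beta = H \subseteq Z(U)$. Let $x \in G$. Since $\delta(g) \in Z(U)$, $\delta(xgx^{-1}g^{-1}) = \delta(x)\delta(g)\delta(x)^{-1}\delta(g)^{-1} = e$. This shows that $[x, g] \in L \subseteq Z(G)$ for all $x \in G$. Consequently, $x \mapsto [x, g]$ is a homomorphism from $G$ to the abelian group $L$, for $[xy, g] = x[y, g]x^{-1}[x, g] = [x, g][y, g]$. This homomorphism is trivial on $L$ and on $[G, G]$. Since $U$ is perfect, $G = L[G, G]$ (see the proof of Proposition 10.4.1), and so $[x, g] = e$ for all $x \in G$. Thus $g \in Z(G)$. In turn, it follows that $E'$ is a central extension by $K$. Since $\Omega$ is a universal central extension, there is a unique homomorphism $\phi$ from $U$ to $G$ such that $(\beta \circ \delta) \circ \phi = \beta$. This shows that $\delta \circ \phi$ and $I_U$ are homomorphisms from $U$ to $U$ which induce morphisms from $\Omega$ to itself. Since $\Omega$ is a universal central extension, it follows that $\delta \circ \phi = I_U$. This shows that $E$ is a split exact sequence.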

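Conversely, suppose that $U$ is perfect and every central extension by $U$ splits. Let

\[ E'' \equiv 1 \longrightarrow L \xrightarrow{\ \alpha'\ } G \xrightarrow{\ \beta'\ } K \longrightarrow 1 \]

be a central extension by $K$. In the light of Proposition 10.4.2, it is sufficient to show the existence of a homomorphism $\phi$ from $U$ to $G$ inducing a morphism from $\Omega$ to $E''$. Consider the subgroup $U \times_K G = \{(u, g) \mid \beta(u) = \beta'(g)\}$ of $U \times G$. We have the extension

\[ 1 \longrightarrow L \longrightarrow U \times_K G \xrightarrow{\ p_1\ } U \longrightarrow 1, \]

where $p_1$ is the first projection, which is clearly a central extension by $U$. From our hypothesis, the sequence splits. Let $t$ be a splitting. Then there is a homomorphism $\phi$ from $U$ to $G$ such that $t(u) = (u, \phi(u)) \in U \times_K G$. But then $\beta'(\phi(u)) = \beta(u)$. Thus, $\phi$ induces a morphism from $\Omega$ to $E''$. □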


Let $R$ be a commutative ring. Recall that the group $E(R)$ is perfect. As such, we have the universal central extension

\[ 1 \longrightarrow M(E(R)) \xrightarrow{\ i\ } E(R) \wedge E(R) \xrightarrow{\ c\ } E(R) \longrightarrow 1 \]

by $E(R)$. The group $E(R) \wedge E(R)$ represents the group of commutator relations in the group $E(R)$ modulo the trivial commutator relations. We shall have another interpretation of this group.

We have the following definition.




Definition 10.4.10 The group M(E(R)) is called the Milnor group of the ring R, 
and it is denoted by K2(R). 


We shall have another way to see the group $K_2(R)$. If $f$ is a homomorphism from a ring $R$ to a ring $R'$, then $f$ induces a natural homomorphism $E(f)$ from $E(R)$ to $E(R')$ given by $E(f)[a_{ij}] = [b_{ij}]$, where $b_{ij} = f(a_{ij})$. Clearly, $E(g \circ f) = E(g) \circ E(f)$ and $E(I_R) = I_{E(R)}$. Further, since $M$ defines a functor from the category of groups to the category of abelian groups, it follows that $K_2$ is a functor from the category of rings to the category of abelian groups in the sense that if $f$ is a homomorphism from a ring $R$ to $R'$, it induces a homomorphism $K_2(f)$ from $K_2(R)$ to $K_2(R')$ such that (i) $K_2(g \circ f) = K_2(g) \circ K_2(f)$ and (ii) $K_2(I_R) = I_{K_2(R)}$.
The following natural exact sequence relates the $K_1$ and $K_2$ functors.

\[ 1 \longrightarrow K_2(R) \xrightarrow{\ i\ } E(R) \wedge E(R) \xrightarrow{\ c\ } GL(R) \xrightarrow{\ \nu\ } K_1(R) \longrightarrow 1, \]

where $c$ represents the commutator map given by $c(x \wedge y) = [x, y]$.
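Recall that the $n \times n$, $n \ge 3$, elementary matrices $E_{ij}^{\lambda}$ with entries in a ring $R$ with identity satisfy the following relations, termed the Steinberg relations.

(i) $E_{ij}^{\lambda} E_{ij}^{\mu} = E_{ij}^{\lambda+\mu}$.
(ii) $[E_{ij}^{\lambda}, E_{kl}^{\mu}] = I_n$ for $i \ne l$ and $j \ne k$.
(iii) $[E_{ij}^{\lambda}, E_{jl}^{\mu}] = E_{il}^{\lambda\mu}$ for $i \ne l$.
(iv) $[E_{ij}^{\lambda}, E_{ki}^{\mu}] = E_{kj}^{-\mu\lambda}$ for $j \ne k$.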
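The Steinberg relations for elementary matrices can be verified directly on examples. The following is a minimal sketch over the ring of integers, with 0-based indices and plain nested lists; it reads relation (iii) as $[E_{ij}^{\lambda}, E_{jl}^{\mu}] = E_{il}^{\lambda\mu}$ and relation (iv) as $[E_{ij}^{\lambda}, E_{ki}^{\mu}] = E_{kj}^{-\mu\lambda}$, with $[x, y] = xyx^{-1}y^{-1}$. These readings and all function names are assumptions of the sketch.

```python
from itertools import product


def identity(n):
    return [[1 if r == c else 0 for c in range(n)] for r in range(n)]


def elementary(n, i, j, lam):
    """E_ij^lam: the identity matrix with lam placed in row i, column j (i != j)."""
    m = identity(n)
    m[i][j] = lam
    return m


def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def steinberg_relations_hold(n, lam, mu):
    """Check relations (i)-(iv) for all index pairs, with fixed integers lam and mu."""
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    for i, j in pairs:  # relation (i)
        if matmul(elementary(n, i, j, lam), elementary(n, i, j, mu)) != \
                elementary(n, i, j, lam + mu):
            return False
    for (i, j), (k, l) in product(pairs, repeat=2):
        a, b = elementary(n, i, j, lam), elementary(n, k, l, mu)
        a_inv, b_inv = elementary(n, i, j, -lam), elementary(n, k, l, -mu)
        c = matmul(matmul(a, b), matmul(a_inv, b_inv))  # commutator [a, b]
        if i != l and j != k:
            expected = identity(n)                      # relation (ii)
        elif j == k and i != l:
            expected = elementary(n, i, l, lam * mu)    # relation (iii)
        elif i == l and j != k:
            expected = elementary(n, k, j, -mu * lam)   # relation (iv)
        else:
            continue  # j == k and i == l: not covered by the Steinberg relations
        if c != expected:
            return False
    return True


print(steinberg_relations_hold(3, 2, 5), steinberg_relations_hold(4, -3, 7))
```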


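For each $n \ge 3$, let $St(n, R)$ denote the group generated by the set

\[ \{x_{ij}^{\lambda} \mid 1 \le i \le n,\ 1 \le j \le n,\ i \ne j,\ \lambda \in R\} \]

subject to the relations

(i) $x_{ij}^{\lambda} x_{ij}^{\mu} = x_{ij}^{\lambda+\mu}$.
(ii) $[x_{ij}^{\lambda}, x_{kl}^{\mu}] = e$ for $i \ne l$ and $j \ne k$.
(iii) $[x_{ij}^{\lambda}, x_{jl}^{\mu}] = x_{il}^{\lambda\mu}$ for $i \ne l$.
(iv) $[x_{ij}^{\lambda}, x_{ki}^{\mu}] = x_{kj}^{-\mu\lambda}$ for $j \ne k$.

For each $n$, we have the natural surjective homomorphism $\phi_n$ from $St(n, R)$ to $E(n, R)$ which maps $x_{ij}^{\lambda}$ to $E_{ij}^{\lambda}$. Clearly, $St(n, R)$ is a subgroup of $St(n + 1, R)$ in a natural way, and we have a chain

\[ St(3, R) \subseteq St(4, R) \subseteq \cdots \subseteq St(n, R) \subseteq St(n + 1, R) \subseteq \cdots \]

of groups. The union $St(R)$ of the chain is a group called the Steinberg group of the ring $R$. Note that the maps $\phi_n$ respect the inclusion maps in the sense that $i_n \circ \phi_n = \phi_{n+1} \circ i_n$, where $i_n$ are the respective inclusion maps. In turn, in the limit, the maps $\phi_n$ induce a surjective homomorphism $\phi$ from $St(R)$ to $E(R)$.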




Theorem 10.4.11 The short exact sequence

\[ 1 \longrightarrow \ker \phi \xrightarrow{\ i\ } St(R) \xrightarrow{\ \phi\ } E(R) \longrightarrow 1 \]

is a universal central extension by $E(R)$.
Before proceeding to prove the above theorem, let us have a corollary.

Corollary 10.4.12 We have natural isomorphisms $\ker \phi \cong K_2(R) \cong M(E(R))$ and $St(R) \cong E(R) \wedge E(R)$. □

Thus, $K_2(R)$ can be viewed as the group of nontrivial relations satisfied by the elementary matrices $E_{ij}^{\lambda}$. Further, we have the exact sequence

\[ 1 \longrightarrow K_2(R) \longrightarrow St(R) \xrightarrow{\ \phi\ } GL(R) \xrightarrow{\ \nu\ } K_1(R) \longrightarrow 1. \]


Lemma 10.4.13 $\ker \phi$ is the center $Z(St(R))$ of $St(R)$.

Proof Let $a \in Z(St(R))$. Then $\phi(a) \in Z(E(R))$. Since $E(R)$ is centerless (a matrix commutes with all elementary matrices if and only if it is a scalar matrix), $\phi(a)$ is the identity. This means that $a \in \ker \phi$. Thus, $Z(St(R)) \subseteq \ker \phi$. Suppose that $\phi(a) = e$. We need to show that $a \in Z(St(R))$. Let $C_n$ denote the subgroup of $St(R)$ generated by the elements of the type $x_{ij}^{\lambda}$, where $i \ne n \ne j$. Clearly, there is an $n \in \mathbb{N}$ such that $a \in C_n$. Fix an $n \in \mathbb{N}$ such that $a \in C_n$. Let $X_n$ denote the subgroup of $St(R)$ generated by the elements of the type $x_{in}^{\lambda}$, where $i \ne n$. Since any two such elements commute (see the Steinberg relations), $X_n$ is abelian. Further, any nonidentity element $x$ of $X_n$ is expressible as

\[ x = x_{i_1 n}^{\lambda_1} x_{i_2 n}^{\lambda_2} \cdots x_{i_r n}^{\lambda_r}, \qquad i_1 < i_2 < \cdots < i_r, \]

where $i_k \ne n$. Now, $\phi(x) = E_{i_1 n}^{\lambda_1} E_{i_2 n}^{\lambda_2} \cdots E_{i_r n}^{\lambda_r}$ is the matrix all of whose diagonal entries are 1, whose $i_k$-th row and $n$-th column entry is $\lambda_k$, $k \le r$, and the rest of whose entries are 0. This shows that the representation of an element $x$ as given above is unique, and $\phi$ is an injective homomorphism when restricted to $X_n$. Similarly, let $Y_n$ denote the subgroup of $St(R)$ generated by the elements of the type $x_{nj}^{\lambda}$, where $j \ne n$. Then $Y_n$ is also abelian, and $\phi$ restricted to $Y_n$ is injective. Consider $x_{ij}^{\lambda}$, where $i \ne n \ne j$. Then for any $x_{kn}^{\mu} \in X_n$, $x_{ij}^{\lambda} x_{kn}^{\mu} (x_{ij}^{\lambda})^{-1}$ is $x_{in}^{\lambda\mu} x_{kn}^{\mu}$ if $j = k$ and $x_{kn}^{\mu}$ otherwise. This means that $x_{ij}^{\lambda} X_n (x_{ij}^{\lambda})^{-1} \subseteq X_n$. Similarly, $x_{ij}^{\lambda} Y_n (x_{ij}^{\lambda})^{-1} \subseteq Y_n$. It follows that $C_n$ is contained in the normalizer of $X_n$ as well as in the normalizer of $Y_n$. Thus, $aX_na^{-1} \subseteq X_n$ and also $aY_na^{-1} \subseteq Y_n$. Let $u \in X_n$. Then $\phi(aua^{-1}) = \phi(a)\phi(u)\phi(a)^{-1} = \phi(u)$. Since $\phi$ is injective when restricted to $X_n$, it follows that $aua^{-1} = u$ for all $u \in X_n$. Thus, $a$ commutes with each element of $X_n$. Similarly, $a$ commutes with all elements of $Y_n$. It follows that $a$ commutes with $x_{in}^{\lambda}$ and also with $x_{nj}^{\lambda}$ for all $i \ne n \ne j$. Consider $x_{kl}^{\lambda}$, where $k \ne n$ and $l \ne n$. Then $x_{kl}^{\lambda} = [x_{kn}^{\lambda}, x_{nl}^{1}]$, and so $a$ commutes with $x_{kl}^{\lambda}$ also. This means that $a$ commutes with all the generators of $St(R)$. Hence $a \in Z(St(R))$. □




In the light of the Proposition 10.4.9, to complete the proof of the Theorem 10.4.11, 
it is sufficient to establish the following Lemma. 


Lemma 10.4.14 Every central extension by St(R) splits. 


Proof Let 
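\[ 1 \longrightarrow H \xrightarrow{\ i\ } G \xrightarrow{\ \xi\ } St(R) \longrightarrow 1 \]

be a central extension by $St(R)$. To show the existence of a splitting, it is sufficient to show the existence of a set $\{s_{ij}^{\lambda} \in G \mid \lambda \in R\}$ of elements of $G$ which satisfy the Steinberg relations and $\xi(s_{ij}^{\lambda}) = x_{ij}^{\lambda}$. Let $t$ be a section of the extension. Then $\xi([t(x_{ik}^{\lambda}), t(x_{kj}^{1})]) = [\xi(t(x_{ik}^{\lambda})), \xi(t(x_{kj}^{1}))] = [x_{ik}^{\lambda}, x_{kj}^{1}] = x_{ij}^{\lambda}$, where $i, j, k$ are distinct. Further, if $t'$ is another section of the extension, then $t(x_{ij}^{\lambda}) = u_{ij}^{\lambda} t'(x_{ij}^{\lambda})$ for some $u_{ij}^{\lambda} \in H$. Since $H \subseteq Z(G)$, it follows that $[t(x_{ik}^{\lambda}), t(x_{kj}^{1})] = [t'(x_{ik}^{\lambda}), t'(x_{kj}^{1})]$. Thus,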
$[t(x_{ik}^{\lambda}), t(x_{kj}^{1})]$ is independent of the choice of a section. In fact, using the trivial commutator relations, and observing that $[t(x_{ij}^{\lambda}), t(x_{kl}^{\mu})] \in H \subseteq Z(G)$ whenever $i \ne l$ and $j \ne k$ (its image $[x_{ij}^{\lambda}, x_{kl}^{\mu}]$ in $St(R)$ being trivial),
it can be shown that $[t(x_{ik}^{\lambda}), t(x_{kj}^{1})]$ is also independent of $k$, $i \ne k$, $j \ne k$. Take $s_{ij}^{\lambda} = [t(x_{ik}^{\lambda}), t(x_{kj}^{1})]$. Using the basic commutator relations, it may be further verified that $\{s_{ij}^{\lambda}\}$ respects the Steinberg relations. □




© Springer Nature Singapore Pte Ltd. 2017
R. Lal, Algebra 2, Infosys Science Foundation Series in Mathematical Sciences,
DOI 10.1007/978-981-10-4256-0


Index 


A 

Abel-Ruffini, 327 

Abelian extension, 311 
Abstract Kernel, 378 

Adjoint of a linear transformation, 117 
Adjoint of a matrix, 133 
Affine lines, 114 

Affine planes, 114 

Algebra, 34 

Algebraic closure, 270, 288 
Algebraic element, 266, 267 
Algebraic extension, 269 
Algebraically closed field, 288 
Algebraically dependent, 266 
Alternating map, 137, 140 
Angle, 103 

Artin-Schreier, 314 
Augmented matrix, 41 


B 

Baer Sum, 389 

Basis, 18 

Bessel's inequality, 106
Bilinear form, 176 

Block multiplication, 39 
Brauer representation, 331 
Burnside, 341, 343 
Burnside Theorem, 359 


C

Cardano solution, 327 
Cauchy-Schwarz, 100
Central extension, 399 
Chain conditions, 229 
Character, 279 




Character afforded by representation, 351 


Characteristic of a field, 6 
Characteristic polynomial, 150 
Characteristic value, 150 
Characteristic vector, 150 
Character ring, 352 
Coefficient matrix, 41 
Co-factor matrix, 133 

Column rank, 43 

Column space, 43 

Companion matrix, 215 
Complete Group, 383 
Composite field extension, 275 
Congruent bilinear forms, 178 
Congruent reduction, 65 
Connecting homomorphism, 400 
Consistent, 41 

Constructible number, 318 
Constructible points, 318 
Coupling, 378 

Cramer’s rule, 134 

Crossed homomorphism, 316 
Cycles, 136 

Cyclic extension, 311 

Cyclic module, 205 
Cyclotomic extension, 311 


D 

Dedekind domain, 250 
Dedekind theorem, 279 
Degree of extension, 265 
Degree of separability, 295 
Determinant of a matrix, 131 
Diagonalisable, 153 
Dimension, 19 

Direct sum of modules, 200 




Direct sum of spaces, 23 
Direct sum representation, 346 
Divisible group, 247 

Dual basis, 83 

Dual space, 80 


E 

Eigenspace, 155 

Eigenvalue, 150 

Eigenvector, 150 

Elementary matrices, 54 
Elementary operations, 44 
Equivalence of extensions, 370 
Equivalent system, 43 

Euclidean inner product, 98 
Euclidean metric, 102 

Euclidean n-space, 8 

Even permutation, 139 

Exact sequence, 238, 368 
Exponent, 208 

Extension of a group by a group, 368 
Exterior algebra, 258 

Exterior power, 256 

Exterior power representation, 346 


F 

Factor system associated to an extension, 373

Factor systems, 373 

Ferrari solution, 328 

Field, 1 

Field extension, 265 

Finitely generated space, 13 

Five lemma, 238 

Five lemma for groups, 370 

Five-term exact sequence, 413 

Fixed field, 276 

Free abelian groups, 205 

Free central extension, 401 

Free module, 203 

Free variable, 45 

Frobenius reciprocity, 362

Function field, 272 

Fundamental Exact Sequence, 400 

Fundamental theorem of algebra, 308 

Fundamental theorem of Galois theory, 305 

Fundamental theorem of homomorphism, 75 


G 
Galois extension, 276 
Galois group, 276 




Gaussian elimination, 44 

General linear group, 78 

Geometry of orthogonal transformation, 167 
Gram-Schmidt process, 107 

Grothendieck group, 259 

Group algebra, 336 

Group of rigid motions, 123 


H 

Hermitian linear transformation, 118 
Hermitian matrix, 37 

Hermitian conjugate, 35 

Hilbert Basis Theorem, 232 

Hilbert Satz 90, 317 

Homogeneous system, 42 
Hyperbolic metric, 175 

Hyperplane, 28 

Hyperplane reflection, 168 


I 

Idempotent linear transformation, 94 
Induced character, 361 
Induced representation, 361 
Injective module, 244 

Inner product, 97 

Inner product space, 97 
Inseparable extension, 294 
Invariant subspaces, 150 
Inverse of a matrix, 38 
Irreducible representation, 346 
Isometry, 120 


J 

Jacobson Density, 340 

Jordan block, 217 

Jordan-Chevalley, 220
Jordan-Chevalley decomposition, 221


K 

K-automorphism, 276 

Kernel of a linear transformation, 74 
K-isomorphism, 276 


L 

Length of a vector, 100 
Linear combination, 13 
Linear functional, 80 
Linear independence, 14 
Linear representation, 346 




Linear space, 9 

Linear transformation, 73 
Local ring, 250 

Lorentz Group, 193 

Lorentz inner product, 193 
Lorentz matrix, 193 

Lorentz Transformation, 193 
LU factorization, 58 


M 

Mackey irreducibility criteria, 366 

Maschke Theorem, 336 

Matrices, 31 

Matrix addition, 32 

Matrix multiplication, 34 

Matrix of transformation, 87 

Matrix representation map, 85 

Milnor group, 419, 423 

Minimum polynomial, 214 

Minimum polynomial of a linear transformation, 94

Minkowski space, 8 

Modular representation, 331 

Multilinear map, 139 


N 

Negative bilinear form, 181 
Nilpotent endomorphism, 91 
Noether equation, 316 
Noetherian module, 230 
Noetherian ring, 230 
Non-abelian exterior power, 414 
Nondegenerate bilinear form, 180 
Nonsingular bilinear form, 180 
Nonsingular matrix, 37 

Norm of a field extension, 315 
Normal basis theorem, 309 
Normal closure, 293 

Normal extension, 284, 291 
Normal form, 60 

Normal transformation, 118 

Null space, 42, 74 

Nullity, 42 

Nullity of a linear transformation, 83 
Number field, 250 


O 

Obstruction, 392 
Obstruction map, 394 
Odd permutation, 139 
Ordered basis, 21 




Orthogonal complement, 115
Orthogonal group, 110 
Orthogonal matrix, 109 
Orthogonal Projection, 115 
Orthogonal reduction, 186 
Orthogonal sum, 168 
Orthogonality of vectors, 104 
Orthogonality relation, 352, 354 
Orthonormal basis, 106 
Orthonormal set, 106 


P 

Parallelogram Law, 104 

Perfect field, 300 

Period of an element, 206 
Permutation, 135 

Pivot, 45 

Pivot variable, 45 

Polar decomposition, 164 

Positive bilinear form, 181 

Positive definite symmetric matrix, 126 
Prime fields, 5 

Primitive element, 270 

Principal minors, 151 

Projective General Linear Group, 399 
Projective module, 243 

Projective representation, 399 

Proper value, 150 

Proper vector, 150 

Purely inseparable extension, 303 
Pythagoras Theorem, 104 


Q 
Quadratic form, 186 
Quotient modules, 201 


R 

Radical extension, 324 

Rank, 43, 50 

Rank of a bilinear form, 180 

Rank of a linear transformation, 83 
Rational canonical form, 215 
Reduced row echelon form, 45 
Reflection about a hyperplane, 129 
Reflexive, 81 

Restricted Burnside Conjecture, 344 
Rigid motion, 122 

Root system, 129 

Rotation group, 174 

Row rank, 43 

Row space, 43 




S 

Scalar matrix, 36 

Schreier extensions, 370 

Schur, 344 

Schur Lemma, 332 

Schur multiplier, 408 
Schur-Zassenhaus, 390

Second co-homology group, 386 

Self adjoint, 37, 118 

Semi-direct product, 382 

Semi-simple linear transformation, 153 
Semi-simple module, 334 

Semi-simple ring, 334 

Separable closure, 296 

Separable element, 294 

Separable extension, 284 

Separable polynomial, 294 

Set of generators, 13 

Shortest distance, 112 

Signature of a real symmetric matrix, 184 
Signature of a symmetric bilinear form, 184 
Similar matrices, 88 

Simple extension, 270 

Simple module, 331 

Singular value, 165 

Singular value decomposition, 166 
Skew-Hermitian linear transformation, 118 
Skew-Hermitian matrix, 37 
Skew-symmetric bilinear form, 184 
Skew symmetric matrix, 36 

Solution space, 42 

Space of linear transformation, 79 
Space of matrices, 33 

Spectral theorem, 161 

Spherical n-space, 175 

Split exact sequence, 239 

Split extension, 382 

Splitting, 382 

Splitting field, 285 

Stably isomorphic, 259 

Steinberg group, 423 

Steinberg relations, 55 

Structure theorem of semi-simple ring, 337 
Subfield, 4 

Sub-representation, 346 

Subspace, 11 




Subspace generated by a set, 13 

Sylvester law, 183 

Symmetric bilinear form, 181 

Symmetric matrix, 36 

Symmetric power of a representation, 346 
System of linear equations, 40 


T 

Tensor algebra, 256 

Tensor product, 251 

Tensor product representation, 346 
Tensors, 256 

Third co-homology group, 394 
Torsion-free module, 197 

Torsion module, 197 

Trace, 151 

Trace form, 95 

Trace of field extension, 315 
Transcendental, 267 

Transpose of a linear transformation, 80 
Transpose of a matrix, 34 
Transpositions, 136 
Transvections, 55 

Triangulable matrix, 157 


U 

Unitary group, 110 

Unitary matrix, 109 

Unitary space, 99 

Unitary transformation, 118 
Universal central extension, 420 


V
Vandermonde matrix, 145 
Vector space, 9 


W
Whitehead group, 262 


Z 
Zelmanov, 344 


